{
  "version": 3,
  "sources": ["ssg:https://framerusercontent.com/modules/YVgynkVt96a0ES9Gk8yz/e2OiwG21hMIpRxQ1FbTF/eo4RAmtig-32.js"],
  "sourcesContent": ["import{jsx as e,jsxs as t}from\"react/jsx-runtime\";import{Link as a}from\"framer\";import{motion as i}from\"framer-motion\";import*as n from\"react\";export const richText=/*#__PURE__*/t(n.Fragment,{children:[/*#__PURE__*/e(\"p\",{children:\"Data is an essential building block of artificial intelligence and data labeling is a key step in developing high-performing machine learning models. In short, data labeling enables these algorithms to build an accurate understanding of the world around us \u2014 and it's showing no signs of slowing down anytime soon.\"}),/*#__PURE__*/e(\"p\",{children:\"As we move toward greater digitalization and automation in our day-to-day activities, data \u2014 and its proper classification \u2014 is becoming increasingly critical to our collective success.\"}),/*#__PURE__*/e(\"p\",{children:\"If you'd like to find out more about labeled data and unlabeled data, you've come to the right place. In this post, we cover a variety of related topics to give you a high-level overview. To start off, we take a closer look at the basic idea of how labeled and unlabeled data are defined, then we go on to analyze the difference between labeled and unlabeled data, subsequently taking a deeper dive into labeled and unlabeled data in machine learning. Let's jump right in and get started.\"}),/*#__PURE__*/e(\"h2\",{children:\"What is data labeling and how does it work?\"}),/*#__PURE__*/e(\"p\",{children:\"Before we get into the weeds, it's important to have some background knowledge about how data labeling works. So, let's begin with the basics.\"}),/*#__PURE__*/e(\"p\",{children:\"Under the umbrella of AI and computer science, machine learning uses data and algorithms to imitate the way humans learn, while gradually improving in accuracy. In machine learning, data labeling is the process of identifying raw data (images, text, videos, and so on) and adding one or more labels to provide context so that a model can learn from it. 
For example, labels help to identify the content of an image, speech in an audio recording, or what's shown on an x-ray.\"}),/*#__PURE__*/e(\"p\",{children:'To create a label, humans are asked to make judgments about a piece of unlabeled data. For example, they take a look at a picture (a data point) and answer the question: \"Is this a picture of a cat or a dog?\"'}),/*#__PURE__*/e(\"p\",{children:\"These labels serve a vital function in helping machine learning models make the best predictions, as do the people responsible for creating, training, fine-tuning, and testing these models. Ultimately, data annotators help guide the data labeling process by creating labeled datasets that are most relevant to a particular project.\"}),/*#__PURE__*/e(\"h2\",{children:\"What\u2019s the difference between labeled and unlabeled data?\"}),/*#__PURE__*/e(\"p\",{children:\"Before we dive into the question at hand, let's start by defining labeled data and unlabeled data.\"}),/*#__PURE__*/e(\"p\",{children:\"So, labeled vs unlabeled data \u2014 what's the difference?\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Labeled data contains meaningful tags and is used in supervised learning, while unlabeled data doesn\u2019t contain additional information and is used in unsupervised learning.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Labeled data requires the additional process of labeling, while unlabeled data is essentially raw data before labeling.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Labeled data is harder to obtain (there are fewer datasets available, or you have to label it yourself), whereas unlabeled data is more abundant.\"})})]}),/*#__PURE__*/e(\"p\",{children:\"Now, let's dive into the details.\"}),/*#__PURE__*/e(\"h3\",{children:\"Labeled 
data\"}),/*#__PURE__*/e(\"p\",{children:\"With the help of human annotators, labeled data enhances a set of unlabeled data with meaningful tags, labels, or classes. Once a labeled dataset is created, a machine learning model can be fed this labeled dataset so that when it encounters new unlabeled data, it can accurately predict and assign an appropriate label to that data.\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:'Labeled data is used in supervised machine learning \u2014 a machine learning approach in which labeled datasets are used to train or \"supervise\" a machine learning algorithm in categorizing data or making accurate predictions (the model can measure its accuracy and learn over time by using labeled inputs and outputs).'})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"It\u2019s harder to obtain and store (can be time consuming and costly).\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"It can be used to identify actionable insights, such as predictions.\"})})]}),/*#__PURE__*/e(\"p\",{children:\"Supervised learning can further be broken down into two subsets:\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Classification\"}),\": Using algorithms to correctly assign test data to specific categories, such as separating junk mail from your inbox.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Regression\"}),\": Using algorithms to understand the relationship between dependent and independent variables and forecasting numbers based on different data points, such as sales revenue projections.\"]}),/*#__PURE__*/e(\"h3\",{children:\"Unlabeled data\"}),/*#__PURE__*/e(\"p\",{children:\"Unlabeled data on the other hand, doesn't have any meaningful tags or labels and usually consists of natural or human-created samples such as photos, 
audio recordings, videos, news articles, tweets, or x-rays that can be easily obtained.\"}),/*#__PURE__*/e(\"p\",{children:\"Computers use labeled and unlabeled data to train machine learning models, but what's the difference?\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Unlabeled data is used in unsupervised machine learning, an approach that applies ML algorithms to analyze and cluster unlabeled datasets by uncovering patterns without the help of humans.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"It\u2019s easier to obtain and store.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"It doesn't have as many uses (however, unsupervised learning methods can help uncover new data clusters for additional categories).\"})})]}),/*#__PURE__*/e(\"p\",{children:\"Unsupervised learning models are used for three main tasks:\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Clustering\"}),\": Grouping unlabeled data based on similarities or differences, as seen in market segmentation, image compression, etc.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Association\"}),\": Using different rules to find relationships between variables in a dataset, as used in market basket analysis and product recommendations.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Dimensionality reduction\"}),\": Applied during the data preprocessing stage when the number of features (or dimensions) in a dataset is too high, for example to improve image quality by removing noise.\"]}),/*#__PURE__*/e(\"h2\",{children:\"Breaking down the learning patterns further\"}),/*#__PURE__*/e(\"p\",{children:\"Wondering what the main difference is between supervised and unsupervised learning? Labeled data. 
To put it briefly, supervised learning uses labeled input and output data, whereas unsupervised learning algorithms do not.\"}),/*#__PURE__*/e(\"p\",{children:\"It's also helpful to note that unsupervised learning has more complex algorithms since we don't know much about the data or anticipated outcomes. With a smaller number of models and fewer ways to check for accuracy, unsupervised learning creates a less controlled environment since the machine is generating outcomes for us.\"}),/*#__PURE__*/t(\"p\",{children:[\"On the other hand, \",/*#__PURE__*/e(\"em\",{children:\"semi-supervised learning\"}),\" combines unlabeled and labeled data (or sets of unlabeled data where only a few pieces of data have labels) into integrated models. There's a lot of research in this area on ways to build better and more accurate real-world models.\"]}),/*#__PURE__*/t(\"p\",{children:[\"The third basic machine learning technique, \",/*#__PURE__*/e(\"em\",{children:\"reinforcement learning\"}),\", enables a model to learn in an interactive environment by trial and error using feedback from the environment.\"]}),/*#__PURE__*/t(\"p\",{children:[\"Reinforcement learning is often associated with the \",/*#__PURE__*/e(\"em\",{children:\"Markov Decision Process\"}),\" (MDP), which provides a mathematical framework for modeling decision making in situations where outcomes are partially random. 
It's often used for studying optimization problems to learn from an interaction and achieve a specific goal.\"]}),/*#__PURE__*/e(\"h2\",{children:\"Reinforcement versus supervised and unsupervised learning: What's the difference?\"}),/*#__PURE__*/e(\"p\",{children:\"So, how does reinforcement learning differ from supervised and unsupervised learning?\"}),/*#__PURE__*/e(\"p\",{children:\"Unlike supervised learning where the feedback provided is based on a correct set of actions for performing a task, reinforcement learning doesn't require any labeled input or output pairings, nor does it need any actions to be corrected. Instead, reinforcement learning uses rewards and penalties to indicate positive and negative actions.\"}),/*#__PURE__*/e(\"p\",{children:\"Compared to unsupervised learning, reinforcement learning has a different objective: to find a suitable action model that maximizes the total cumulative reward \u2014 whereas the aim of unsupervised learning is to find similarities and differences between pieces of data.\"}),/*#__PURE__*/e(\"h2\",{children:\"How can labeled and unlabeled data be used?\"}),/*#__PURE__*/e(\"p\",{children:\"Now that we have a clearer understanding of the differences between labeled data and unlabeled data, let's look at how they can be used. Because of their differences, some machine learning algorithms can only work with a labeled dataset while others can only work with unlabeled data.\"}),/*#__PURE__*/e(\"p\",{children:\"This depends on several factors including the type of task, the primary aim of the task, the availability of data, the degree of general versus specific knowledge needed to carry out the required data annotation and tagging, and the overall complexity of the decision-making process.\"}),/*#__PURE__*/e(\"p\",{children:\"As mentioned, labeled data corresponds to regression and classification tasks, which fall under the category called supervised learning. 
These include:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Predicting unseen values.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Mapping the relationship between two variables.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Testing scientific hypotheses.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Entity recognition via computer vision and speech-to-text systems.\"})})]}),/*#__PURE__*/e(\"p\",{children:\"Whereas unlabeled data is associated with clustering and dimensionality reduction tasks, which fall under the category called unsupervised learning. These include:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Identifying subsets of observations that share common characteristics.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Decreasing the complexity of a dataset to reduce the resources needed to process it.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Standardizing a dataset to train neural networks (known as feature scaling).\"})})]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Note\"}),\": a neural network is a specific type of machine learning model that teaches computers to process data in a way that resembles the human brain.\"]}),/*#__PURE__*/e(\"p\",{children:\"Unlabeled data used in unsupervised learning extracts insights based solely on the quantitative characteristics of datasets. 
Since it requires little previous knowledge, the objectives aren't that complex \u2014 they may include:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Reducing the dimensionality of a dataset to limit the resources needed to train neural networks.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:'Developing a neural network that encodes a dataset into a higher-level abstract representation (known as an \"autoencoder\").'})})]}),/*#__PURE__*/e(\"p\",{children:\"Supervised learning often has more varied goals, which might include:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Recognizing objects in images.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Predicting the value of stocks.\"})})]}),/*#__PURE__*/e(\"h2\",{children:\"Types of data labeling in machine learning models\"}),/*#__PURE__*/e(\"p\",{children:\"Data labeling has a lot of different uses. Many have to do with computer vision, natural language processing, and audio processing (which includes speech recognition). Let's take a closer look at these three main categories.\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Computer Vision\"}),\" allows systems to obtain data from digital input like images and videos, then take action or make recommendations according to that input.\"]}),/*#__PURE__*/e(\"p\",{children:\"Data labeling is the first phase of generating a training dataset. Some initial tasks could include labeling images or key points. You could also use a bounding box (an enclosed border) to select an object or group similar elements in an image.\"}),/*#__PURE__*/e(\"p\",{children:\"Another option is to categorize images by type or content. You can also segment an image at the pixel level. 
From there, this training dataset serves as the foundation for a computer vision model, which has many uses such as classifying images, locating objects, determining key points, and more.\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Natural Language Processing\"}),\" aims to build machines that are able to comprehend and respond to written or spoken words in the same way that humans are able to.\"]}),/*#__PURE__*/e(\"p\",{children:\"To create a training dataset, either pinpoint the primary parts of text you want to highlight or use tags with distinct labels. Examples include determining parts of speech, classifying individual names and locations, or discerning text in images.\"}),/*#__PURE__*/e(\"p\",{children:\"To label text in images, you can use bounding boxes to outline the text and transcribe the written words. Some use cases for these models include optical character recognition, named entity recognition, and sentiment analysis.\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Audio Processing\"}),\" is central to recording, enhancing, storing, and transmitting audio content.\"]}),/*#__PURE__*/e(\"p\",{children:\"Also used to remove unwanted noise, add effects, boost frequency ranges, and more, this process converts different sounds such as dialects, construction noise, or animal sounds into recognizable patterns that can be incorporated into machine learning. You'll need to describe and write out these sounds (for example, a dog barking, a bird chirping, or an alarm going off). 
Then, you can delve deeper by adding tags and classifying the auditory parts \u2014 this serves as your training dataset.\"}),/*#__PURE__*/e(\"h2\",{children:\"Best ways to label data\"}),/*#__PURE__*/e(\"p\",{children:\"You can make data labeling better, faster, and more precise in several ways, such as:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Developing insightful and efficient task interfaces for the people who will be classifying your data. The more streamlined your labeling process is, the more efficient your labeling efforts are. This is especially noticeable when labeling huge amounts of data.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Utilizing various methods of aggregating data to avoid errors and offset personal biases. Creating agreement among labelers can be done by collecting and consolidating feedback from many people into a single label.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Evaluating labels to check for correctness by giving evaluation tasks to a different group of labelers. They can check the correctness of the initial labeling performed by other people.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Applying active learning to determine which additional data needs to be labeled.\"})})]}),/*#__PURE__*/e(\"h2\",{children:\"How can we streamline data labeling?\"}),/*#__PURE__*/e(\"p\",{children:\"To build high-performing machine learning models, you need high quality data. Getting ahold of this data can be costly, complex, and inefficient. Labels created by people are needed for the majority of models to help them generate the correct predictions. 
To help streamline and automate this process, you can apply a machine learning model to label the data directly.\"}),/*#__PURE__*/e(\"p\",{children:\"Firstly, a machine learning model is trained on a subset of raw training data that has already been labeled by humans. A model with a track record of producing precise outcomes from the information that it has learned thus far, can add labels to unlabeled data automatically. A less accurate model requires human annotators to add labels. Labels created by humans enable the model to learn and enhance its capacity to categorize new data.\"}),/*#__PURE__*/e(\"p\",{children:\"Eventually, the model is able to label an increasing amount of data automatically and speed up the creation of training datasets. Of course, implementing quality control in such models is also a necessity, as with time it might drift and start producing less accurate results. In this case, human annotators can step in again.\"}),/*#__PURE__*/t(\"p\",{children:[\"Internal labeling (in-house), synthetic labeling (generating new data from previous datasets), programmatic labeling (using scripts), outsourcing (or freelancing) constitute a variety of data labeling methods. However, our favorite is obviously crowdsourcing \u2014 a great way to outsource data labeling and get around the drawn-out and expensive management processes. 
Check out our \",/*#__PURE__*/e(a,{href:\"https://toloka.ai/data-labeling-platform/\",motionChild:!0,nodeId:\"eo4RAmtig\",openInNewTab:!1,relValues:[],scopeId:\"contentManagement\",smoothScroll:!1,children:/*#__PURE__*/e(i.a,{children:\"data labeling platform\"})}),\" to learn more!\"]}),/*#__PURE__*/e(\"h2\",{children:\"To sum up\u2026\"}),/*#__PURE__*/e(\"p\",{children:\"At its basics, data labeling provides a way to categorize data by assigning an appropriate tag or label to raw data \u2014 examples of which include pictures, written words, as well as video and audio recordings.\"}),/*#__PURE__*/e(\"p\",{children:\"Data labeling gives meaning and context for machine learning models, which apply this data to generate better and more exact predictions.\"}),/*#__PURE__*/e(\"p\",{children:\"There are a lot of uses for labeled data across computer vision, natural language processing, and speech recognition. Countless companies across industries combine the software, processes, and human efforts of data scientists and annotators to sort, categorize, and label data \u2014 which essentially turns into a training dataset for machine learning models.\"}),/*#__PURE__*/e(\"p\",{children:\"As alluded to earlier, data labeling can either be carried out manually (by a human) or automatically (by a machine) \u2014 both have pros and cons. Manual labeling by humans is rather expensive in both the financial and time sense, but crowdsourcing provides a great alternative.\"}),/*#__PURE__*/t(\"p\",{children:[\"By using a \",/*#__PURE__*/e(a,{href:\"https://toloka.ai/data-labeling-platform/\",motionChild:!0,nodeId:\"eo4RAmtig\",openInNewTab:!1,relValues:[],scopeId:\"contentManagement\",smoothScroll:!1,children:/*#__PURE__*/e(i.a,{children:\"crowdsourcing platform\"})}),\" like Toloka for data labeling tasks, you can successfully tap into the wisdom of the crowd on a large scale \u2014 or earn some extra cash doing fun micro tasks as a fellow Toloker. 
With countless annotators around the world carrying out tasks posted by AI teams and businesses alike, our platform gives individuals and corporations alike the necessary tools to oversee data labeling quality and construct a streamlined pipeline for their machine learning tasks.\"]}),/*#__PURE__*/t(\"p\",{children:[\"To get up to speed on all things related to data labeling, machine learning, and AI, we invite you to visit our \",/*#__PURE__*/e(a,{href:\"https://toloka.ai/blog/\",motionChild:!0,nodeId:\"eo4RAmtig\",openInNewTab:!1,relValues:[],scopeId:\"contentManagement\",smoothScroll:!1,children:/*#__PURE__*/e(i.a,{children:\"blog\"})}),\".\"]}),/*#__PURE__*/e(\"h2\",{children:\"About Toloka\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"em\",{children:\"Toloka is a European company based in Amsterdam, the Netherlands that provides data for Generative AI development. Toloka empowers businesses to build high quality, safe, and responsible AI. We are the trusted data partner for all stages of AI development from training to evaluation. Toloka has over a decade of experience supporting clients with its unique methodology and optimal combination of machine learning technology and human expertise, offering the highest quality and scalability in the market.\"})})]});export const richText1=/*#__PURE__*/t(n.Fragment,{children:[/*#__PURE__*/e(\"p\",{children:\"Machine learning is one of the fields of artificial intelligence that allows us to train computers without resorting to programming all the business logic directly. AI already plays a huge role in the development of computer science and the IT industry today.\"}),/*#__PURE__*/e(\"p\",{children:\"Businesses are paying more and more attention to intelligent tools in order to develop their operations. It may be implemented in any software tool, as well as in research, manufacturing processes, and actual products. 
To understand how it all works first it's necessary to explore some essential terminology, types of machine learning, and its algorithms.\"}),/*#__PURE__*/e(\"p\",{children:\"Before we begin explaining what this or that ML term stands for, let's think about why we need to know it. The answer may be very simple: machine learning, and artificial intelligence in general, is practically our everyday life now.\"}),/*#__PURE__*/e(\"p\",{children:\"ML algorithms solve complex tasks like natural language processing (think automatic translators), sentiment analysis and content moderation, they do optical character recognition to enable text detection in images, and computer vision helps detect various objects in pictures and videos. Many modern smartphones, computers, programs, and even games today employ machine learning.\"}),/*#__PURE__*/e(\"p\",{children:\"Machine learning entails that the computer recognizes patterns from examples. This does not involve programming with a certain set of so-called guidelines. Such patterns are contained in the data. Machine learning involves creating algorithms or a set of rules which learn from known patterns in the data and then make predictions.\"}),/*#__PURE__*/e(\"p\",{children:\"In general, the goal of machine learning is to predict the result from some input data. For the system to learn how to make more accurate predictions, experts must first tell the computer the correct answers so that it can learn them. Although not all types of ML require a high quality training data set, which will be discussed further in more detail.\"}),/*#__PURE__*/e(\"p\",{children:\"The more diverse and accurate the training data, the easier it is for the machine to find patterns and the more accurate the outcome. So, the fundamental principle of ML is that machines receive data and learn from it. 
An AI system learns how to perform various tasks, which allows it to make predictions or decisions based on previous data with minimal or no instruction from humans. However, before a computer can learn to make decisions like a human, a human has to teach it in most cases.\"}),/*#__PURE__*/e(\"p\",{children:\"To enable us to incorporate the results of ML into our day-to-day activities, teams of data scientists and engineers create machine learning models. An ML model is essentially a piece of software (basically, a file containing all the layers and weights) that is trained to identify multiple correlations or patterns in a dataset. Normally, ML engineers train a model from datasets using a machine learning algorithm, which the model uses to analyze and remember that learning data.\"}),/*#__PURE__*/e(\"h2\",{children:\"The process of building a machine learning model\"}),/*#__PURE__*/e(\"p\",{children:\"To launch the machine learning process, you first have to obtain a certain amount of labeled data, called a dataset. The algorithm will learn from it. For example, the dataset can consist of photos of dogs and cats that already have labels identifying the animal in each image.\"}),/*#__PURE__*/e(\"p\",{children:\"After the training phase, the model will be able to recognize the dogs and cats in new images without labels. The process of training continues even after the predictions have been generated; ideally, the more labeled data the computer analyzes, the more accurately it will recognize animals in new images.\"}),/*#__PURE__*/e(\"p\",{children:\"Preparing a machine learning model involves the following main steps:\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"strong\",{children:\"Data collection\"})}),/*#__PURE__*/e(\"p\",{children:\"Obtaining the required data is the start of the whole process. 
A designated team of experts in the field of ML identifies the data that will be utilized for training. By knowing what it is they want to predict, they can easily determine exactly what data might be more useful or valuable for their project.\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"strong\",{children:\"Algorithm selection and data labeling process\"})}),/*#__PURE__*/e(\"p\",{children:\"Algorithm selection is among the initial choices one makes in ML. A mathematical algorithm is at the core of any model, determining how the model will find patterns in the collected data. The algorithm is the math force that stands behind the ML model. The same model can use various algorithms. An algorithm without a model is simply a set of mathematical equations.\"}),/*#__PURE__*/e(\"p\",{children:\"Simultaneously with the choice of algorithm, the collected data undergo a series of transformations to shape the training set. Data is often edited, refined, labeled, and enhanced manually to achieve acceptable data quality for future models.\"}),/*#__PURE__*/e(\"p\",{children:\"Once the data has been prepared, developers proceed to Feature Engineering. The features are the data values that the model will employ in training as well as later in real-world implementations. Specialists review the available data and generate a list of features that have great predictive power.\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"strong\",{children:\"Model training\"})}),/*#__PURE__*/e(\"p\",{children:\"Training is a fundamental component of the overall procedure. Experts upload training sets to the ML model to train it to make predictions based on new data.\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"strong\",{children:\"Testing\"})}),/*#__PURE__*/e(\"p\",{children:\"It is essential to note the difference in terms such as training set and test set. The first is utilized to train the model, and the second one is employed to test it. 
Trained models are evaluated through the test data to make sure that predictions are highly accurate.\"}),/*#__PURE__*/e(\"p\",{children:\"After comparing these test results, the model may then be adjusted, modified, or trained again on some other data. Training and evaluation continue until the model achieves an appropriate rate of correct predictions.\"}),/*#__PURE__*/e(\"p\",{children:\"Data labeling work is often also involved at this step, because testing involves comparing the model's predictions to pre-labeled reliable data. Without constant quality control, a trained model might \\\"drift\\\" from the correct path and start providing unreliable results with time. A data scientist usually keeps an eye on the network's results and runs data labeling tasks to prepare control datasets.\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"strong\",{children:\"Practical application\"})}),/*#__PURE__*/e(\"p\",{children:\"The final phase is the actual practical use of the ML model. Essentially, it involves the end user employing the model to generate predictions generated from real data, which is similar to the training dataset in terms of its content but does not contain labels.\"}),/*#__PURE__*/e(\"p\",{children:'Let\\'s say a development team wants to build an application to identify rotten tomatoes on a conveyor belt. They can train a model on a set of images of rotten and fresh tomatoes, each marked with a \"rotten\" or \"fresh\" label. Then they implement that model in an application to recognize these fruits. Specialists have to create a model that identifies which tomato is rotten and which is fresh. In other words, after training, the computer system itself should be able to assign a label to each tomato it analyzes.'}),/*#__PURE__*/e(\"h2\",{children:\"Labeling data in machine learning\"}),/*#__PURE__*/e(\"p\",{children:\"What is a label in machine learning? 
A label is a description that informs an ML model what a particular data represents so that it may learn from the example.\"}),/*#__PURE__*/e(\"p\",{children:\"Data labeling, also sometimes called data annotation, is the process of adding labels to raw data to show the ML model the desired responses that it should be able to forecast. The data labeling procedure is a critical step performed by managed data labeling teams. A label represents the ground truth that your output data is compared to.\"}),/*#__PURE__*/e(\"p\",{children:\"Once experts have created a successful ML model, the computer should be able to figure out the labels on its own. Thus, we can say that labels are an output you get from your model after training it.\"}),/*#__PURE__*/e(\"h2\",{children:\"Features in machine learning\"}),/*#__PURE__*/e(\"p\",{children:\"A feature in machine learning refers to an individual measurable characteristic or property of an object that is being observed. It is one of the most common input methods in machine learning. The choice of meaningful, distinguishable, and independent features is a fundamental component of building efficient ML algorithms.\"}),/*#__PURE__*/e(\"p\",{children:\"To put it simply, machine learning features describe the characteristics of your training data. In training datasets represented as tables, the features are found in columns. Each column stands for a specific characteristic or property.\"}),/*#__PURE__*/e(\"p\",{children:\"For example, in image labeling features usually include patterns, colors, and shapes that are present in images, something like fur, feathers, or, a lower-level version such as pixel values.\"}),/*#__PURE__*/e(\"p\",{children:\"Features also allow you to distinguish one object from another in a picture. For example, if there is fur present in a photo, then it is probably a dog, and if there are feathers, then it is more likely a bird. 
However, for the model to accurately identify the object in the photo, you will need more features.\"}),/*#__PURE__*/e(\"p\",{children:\"Frequently, datasets that specialists have to work with comprise a large number of features, which may number several hundred or even thousands. It is not always obvious when building an ML model which of the features are really relevant to it, and which are unnecessary. To put it another way, the initial set of features may be too large to be processed. Thus, a preliminary step in many machine learning applications consists of feature selection or the extraction of a new downsized feature set.\"}),/*#__PURE__*/e(\"p\",{children:\"Feature selection refers to the procedure of selecting a subset of features from the original features so that the feature set is optimized and reduced according to a certain requirement. Feature extraction (sometimes referred to as construction) is a procedure that generates a set of new features.\"}),/*#__PURE__*/e(\"h2\",{children:\"ML label vs feature\"}),/*#__PURE__*/e(\"p\",{children:\"At first glance, labels and features in machine learning may seem like they describe very similar, if not identical, concepts, but this is far from the truth. As discussed above, a label represents an output value, while a feature is an input value that describes a characteristic of the data being labeled.\"}),/*#__PURE__*/e(\"p\",{children:\"For example, suppose you have a completed ML model, which has been pre-trained with datasets of dog breeds, in which their characteristics (features) and corresponding breeds (labels) have been specified. Now the algorithm built into this model should be able to determine the type of dog (label) by the traits (features) you provide to it.\"}),/*#__PURE__*/e(\"p\",{children:'When ML models are used in real life, experts present input data with features to the computer. For example, in a factory, a conveyor belt moves a lot of tomatoes. 
The ML model captures the features of tomatoes by employing computer vision tools. If the model recognizes such tomato features as its color which is brown or black the label \"rotten\" is assigned to the fruit, if the color is red, yellow, or orange it means that the tomato is fresh. It is worth mentioning that this is a simplified approximated description of the process.'}),/*#__PURE__*/e(\"p\",{children:\"The features have to be uploaded to the computer system as input so that the ML algorithms can generate a label as an output. This is the difference between ML labels and features. The more and better the labeled training dataset was, the better the model will predict labels by features.\"}),/*#__PURE__*/e(\"h2\",{children:\"Targets in ML and their differences from labels and features\"}),/*#__PURE__*/e(\"p\",{children:\"A couple of terms that are often interchangeable in ML are target vs label. A target is a dataset variable to be predicted by an ML model. This is the variable that describes the outcome of the process. Broadly speaking, the terms label and target may be used interchangeably. However, the label is more common within classification algorithms than within regression ones. Target is the final output an ML model is trying to predict. It can be categorical or continuous.\"}),/*#__PURE__*/e(\"p\",{children:\"The label represents the true outcome of the target. As mentioned earlier the labels are assigned to the training dataset but when the ML model is ready it is fed with unlabeled data. The label has a known or correct value, but finding the target variable is the main task of the ML model. As the model is trained, experts try to train the model in the best and highest quality way possible so that the targets are closest to the labels. 
However, predictions on the target variables will most likely never be one hundred percent perfectly correct but will strive toward that goal as they are trained and errors are corrected.\"}),/*#__PURE__*/e(\"p\",{children:\"As for the difference between features and targets in ML, this should be obvious by now. As we have already found out, the target is often referred to as a synonym for a label, since there is essentially not much difference between the two terms. The label has a precisely known value, while the target is the variable the model is trying to predict. Under ideal conditions, label and target would be the same thing. The features describe the label, so they should also describe the target. If the output of the ML model indicates that the target and features do not match, then there must have been an error somewhere in the ML model that needs to be fixed for it to give correct predictions.\"}),/*#__PURE__*/e(\"h2\",{children:\"Types and algorithms of machine learning\"}),/*#__PURE__*/e(\"p\",{children:\"There are quite a number of ML methods, although among them we may distinguish 4 key types:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"classic learning\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"ensemble learning\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"reinforcement learning\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"neural networks and deep learning.\"})})]}),/*#__PURE__*/e(\"h3\",{children:\"Classic learning\"}),/*#__PURE__*/e(\"p\",{children:\"The earliest and most basic ML methods fall into two categories - Unsupervised and Supervised Learning.\"}),/*#__PURE__*/e(\"h3\",{children:\"Unsupervised learning\"}),/*#__PURE__*/e(\"p\",{children:\"This type of ML is not as frequently used as supervised 
learning in real situations. In unsupervised learning, algorithms do not require a labeled training data set. That is, data labeling is not necessary, the machine does not need a human to guide it, and it tries to find any patterns in the provided data on its own. For example, experts merely show a machine a large number of pictures of objects and tell it to figure out the similarities between them. This type of learning includes the following kinds of tasks:\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Clustering\"}),\". Clustering involves arranging objects into relatively homogeneous groups. It separates data based on characteristics that seem similar to the machine. Clustering is applied for text analysis, customer segmentation for marketing purposes, and even for preventing fraud.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Association\"}),\". This type is used to find sets of characteristics, and their values, which are frequently encountered in the feature descriptions of objects. Another name for this approach is rule-finding: a machine analyzes a dataset and finds features that occur together frequently.\"]}),/*#__PURE__*/e(\"p\",{children:\"For example, in e-commerce, by analyzing shopping carts, the computer finds items that are often bought together. This way it concludes that one product may be recommended as a match for another, and suggests a favorable choice of goods.\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Dimensionality reduction\"}),\". This type of ML algorithm gathers specific attributes into a higher-level set of abstractions, combining multiple features into more general or abstract classes. 
If you have objects with similar features, the algorithm combines them into one category.\"]}),/*#__PURE__*/e(\"h3\",{children:\"Supervised learning\"}),/*#__PURE__*/e(\"p\",{children:\"Here the computers have a kind of teacher telling them how to do things right. This teacher helps the machine to understand that in a certain picture there is a house, and in another one there's a car. In other words, all the data has already been pre-labeled by the \u201Cteacher\u201D, and they show the machine where a house and a car are located in a photo. The computer learns from these specific examples. This method consists of training a machine learning model using labeled data, and thus requires proper data labeling. This method is used much more often than the unsupervised one. A machine will learn faster and more accurately with a human instructor than it would with unlabeled data.\"}),/*#__PURE__*/e(\"p\",{children:\"Supervised learning assumes that the anticipated answer to the given problem is unknown for new data, although it is already identified in the training dataset. In other words, data labeling is done to provide the right responses, and the challenge for the algorithm is to find them in the new data. Supervised learning is divided into two types:\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Classification\"}),\". Such an algorithm makes predictions of object categories or separates the data according to a certain feature. It replies to a yes/no question, or it chooses one of the two. For example, it decides whether the tomatoes are good or bad, whether cows or goats shown in the photo, and so on. It is also possible to perform multi-class classification, the task of which is to categorize the data into more than two classes. 
These are the ways computers filter email spam, divide articles by topic and language in search engines, or match users with music that is similar to their preferences.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Regression\"}),\" is essentially just like classification, except using numeric characteristics. The machine finds the dependence of some numeral data on another and then is able to make predictions. Thus the software can predict the trend of a company's stock by demand for goods or the price of a car depending on its mileage and so on.\"]}),/*#__PURE__*/e(\"h3\",{children:\"Ensemble learning\"}),/*#__PURE__*/e(\"p\",{children:\"In ensemble learning, multiple models are trained to resolve a single problem and merged to obtain the best results. When it comes to ensembles, the concept of a weak learner is introduced meaning conventional models like linear regression or decision trees. A set of weak learners serve as structural units for more complex models. The combination of weak learners in order to improve the performance of a model is called a strong learner.\"}),/*#__PURE__*/e(\"p\",{children:\"Algorithms are trained simultaneously and may correct each other's mistakes. The underlying assumption is that the results of multiple models will be more accurate than the ones from just one model. For instance, ensembles are employed to identify peoples' faces and objects in the smartphone camera. The most popular ensemble methods are:\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Boosting\"}),\". This method involves consecutive algorithms training. Several similar models are trained sequentially in this method, fixing each other's errors. After processing one set of data, the machine is given the next set. 
The next set of data contains additional results different from the desired ones, and the algorithm then tries to find a solution.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Bagging\"}),\". The basic idea of bagging is to train several identical models on different data samples. In this case, homogeneous models are trained on different data sets and then combined. As a result, a prediction is obtained by averaging several predictions made by different models.\"]}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"strong\",{children:\"Stacking\"}),\". This method may use algorithms of various types, not just from a certain family. A meta-model receives the predictions of the base models as input and outputs a final prediction. Various weak learners are trained in parallel, and a meta-learner combines their outputs to obtain more accurate predictions. The goal of the method is to train weak learners, but in practice, the accuracy is still low and the approach is rarely used.\"]}),/*#__PURE__*/e(\"h3\",{children:\"Reinforcement learning\"}),/*#__PURE__*/e(\"p\",{children:\"Reinforcement learning is a method of machine learning in which the system learns by trial and error while interacting with an environment. An agent interacts with the environment by taking actions. The environment rewards the agent for those actions, and the agent keeps performing them. The reward for a successfully completed task is the opportunity to take on a new task, and the points gained in the process of solving it.\"}),/*#__PURE__*/e(\"p\",{children:\"The more efficiently the objective is accomplished, the more points are awarded. 
Initially, situations are designed in a virtual environment, after that artificial intelligence continues to explore and learn in the real world. The algorithm is employed to train artificial intelligence in the gaming industry, robot vacuum cleaners, and autopilot self-driving vehicles.\"}),/*#__PURE__*/e(\"h3\",{children:\"Neural networks\"}),/*#__PURE__*/e(\"p\",{children:\"Neural networks represent mathematical models and their software implementations, which are derived from the structure of the human nervous system. The central distinctive characteristic of a network is the capacity to learn. These networks are composed of several layers:\"}),/*#__PURE__*/t(\"ul\",{children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"The input layer receives the data set.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Hidden layers perform calculations based on the input parameters.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"The output layer produces the result of the computations.\"})})]}),/*#__PURE__*/e(\"p\",{children:'Each of them has several nodes (neurons), which are interconnected with other nodes in the network through various links and have their own \"weight\" that affects the strength of the transmitted signal. Such design enables simultaneous processing of data and allows for constant comparative analysis of the processing results at each of the stages.'}),/*#__PURE__*/e(\"p\",{children:\"In recent years, higher performance capabilities of computers allowed these networks to perform more and more complex and interesting tasks. The processing capacity of the system is essential since each neuron is constantly performing computationally demanding calculations. A complicated task usually requires a huge amount of neurons and a lot of mathematical calculations. 
Understandably, this would imply the need for a very powerful machine.\"}),/*#__PURE__*/e(\"p\",{children:\"To put it simply, the connections between neurons are implemented like this. One of the neurons sends some calculations to another neuron, which receives and processes this information and then passes the result of its own calculations to another neuron. Thus, information spreads throughout the network, and the learning process occurs.\"}),/*#__PURE__*/e(\"p\",{children:\"When specialists train a neural network, they introduce it to data that should be used to predict something, as well as a set of correct responses for such data. As it was previously mentioned for other algorithms, such correct responses are called a training dataset. Of course, all above is a huge simplification: actual neural networks are more intricate than that. A lot of information is required to train an AI properly.\"}),/*#__PURE__*/e(\"h3\",{children:\"Deep learning\"}),/*#__PURE__*/e(\"p\",{children:\"As recently as 10 years ago the evolution of neural networks was suspended for the lack of processing power of computers. Once this obstacle was gone, deep learning was introduced, bringing data processing to a completely new level.\"}),/*#__PURE__*/e(\"p\",{children:\"Deep learning describes a type of ML that employs multi-layered networks that acquire knowledge by learning on massive datasets. Deep learning artificial intelligence finds the algorithm for solving the initial task on its own, learning from its mistakes and giving a more accurate result after each training session.\"}),/*#__PURE__*/e(\"p\",{children:'The networks are split into layers - neuron structures with a shared objective. Deep learning employs a higher number of hidden layers. Such models are called deep learning networks. Each calculation is considered to be a layer, so complex deep learning networks have numerous layers, thus the name \"deep\" networks. 
The more complicated a neural network is, that is, the more layers and neurons it possesses, and the more computational operations it performs, the better results it tends to produce.'}),/*#__PURE__*/e(\"h2\",{children:\"Conclusion\"}),/*#__PURE__*/e(\"p\",{children:\"Machine learning is an intriguing field of knowledge that requires a thorough and time-consuming exploration. The current trend is that ML algorithms are likely to be applied even more extensively in the near future, and knowledge of the topic will simply be a necessity. Nowadays, this field of AI is already firmly embedded in our daily lives and greatly simplifies it not only on a day-to-day basis but also in the workplace. It is necessary to spend a lot of effort and time studying this topic and then all the incredible possibilities of machine learning will open its doors to you.\"}),/*#__PURE__*/e(\"h2\",{children:\"About Toloka\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"em\",{children:\"Toloka is a European company based in Amsterdam, the Netherlands that provides data for Generative AI development. Toloka empowers businesses to build high quality, safe, and responsible AI. We are the trusted data partner for all stages of AI development from training to evaluation. Toloka has over a decade of experience supporting clients with its unique methodology and optimal combination of machine learning technology and human expertise, offering the highest quality and scalability in the market.\"})})]});export const richText2=/*#__PURE__*/t(n.Fragment,{children:[/*#__PURE__*/e(\"p\",{children:\"There's a lot of excitement when it comes to developments in AI and image recognition technology. The ability of machines to interpret, analyze, and assign meaning to images is a key area of interest and innovation.\"}),/*#__PURE__*/e(\"p\",{children:\"Companies across industries are rapidly adopting image recognition technologies for a wide variety of purposes. 
A huge part of this progress has become possible due to the ever-increasing number of digital photos and videos uploaded online by people all over the world. This growing volume of data helps make visual data processing and analysis faster, more accurate, and more efficient.\"}),/*#__PURE__*/e(\"p\",{children:\"Healthcare, marketing, transportation, and e-commerce are just a few of the many fields where this technology is applied today. Emerging technologies like augmented reality, virtual reality, and computer vision applications are all based on AI image recognition. It's even been prominently featured in Hollywood blockbusters \u2014 from the 1980s classic RoboCop to Blade Runner.\"}),/*#__PURE__*/e(\"p\",{children:\"If you're looking to learn more about artificial intelligence and image recognition, you've come to the right place. In this post, we explore the use of AI in more detail, drawing on image recognition examples to address the following questions: What is image recognition in AI? And how does it work?\"}),/*#__PURE__*/e(\"p\",{children:\"To find out where we're going, it's important to understand where we've been \u2014 and how this technology has developed into what it is today, along with its potential future uses. As we dive into key terms, current uses, and future applications, we also take a closer look at the evolution of this rapidly growing technology to date.\"}),/*#__PURE__*/e(\"h2\",{children:\"Key terms of the image recognition technology\"}),/*#__PURE__*/e(\"p\",{children:\"First, let's start off by defining some key terms so that we can better understand how they're related to one another and how they contribute to the development of AI as a whole.\"}),/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Image recognition\"}),' involves the creation of a neural network that processes the individual pixels of an image. 
In other words, it\\'s a type of AI programming that can \"understand\" the content of an image by analyzing and interpreting pixel patterns. Researchers feed these networks with as many pre-labeled images as possible to \"teach\" them how to recognize similar images.']}),/*#__PURE__*/t(\"p\",{children:[\"Image recognition is a subcategory of \",/*#__PURE__*/e(\"em\",{children:\"computer vision\"}),', which is an all-encompassing descriptor for the process of training computers to \"see\" like humans and take action. Even without realizing it, we frequently engage in mundane interactions with computer vision technologies like facial recognition. ',/*#__PURE__*/e(\"em\",{children:\"Image processing\"}),\" is a sweeping term for using machine learning algorithms to analyze digital images.\"]}),/*#__PURE__*/e(\"p\",{children:\"Any AI system that processes visual information generally relies on computer vision \u2014 and those systems that can identify certain objects or categorize images based on their content are performing AI image recognition. This is critical for machines that need to recognize and categorize different objects around them accurately and efficiently. For example, driverless cars that use computer vision to identify pedestrians, traffic signs, and other vehicles in the vicinity.\"}),/*#__PURE__*/t(\"p\",{children:[\"A related term, \",/*#__PURE__*/e(\"em\",{children:\"pattern recognition\"}),\", is a broader concept compared to computer vision which focuses on image recognition. However, image recognition can be described as a common application of pattern recognition where a computer vision system is trained to recognize patterns in images, and then identify images that contain those patterns. 
Valuable use cases include identifying faces in photos, recognizing and classifying objects, finding landmarks, and detecting body poses or keypoints.\"]}),/*#__PURE__*/e(\"p\",{children:\"Data inputs for pattern recognition can be words or texts, images, or audio files. As a compilation of loosely related areas and techniques, pattern recognition analyzes incoming data and tries to identify patterns. It's an absolute must for intelligent systems such as CAD systems in medicine. Other techniques include speech recognition, text classification, and automatic recognition of images of human faces or handwriting.\"}),/*#__PURE__*/t(\"p\",{children:[\"Image recognition is sometimes confused with \",/*#__PURE__*/e(\"em\",{children:\"image detection\"}),\", which involves taking an image as input and finding various objects within it \u2014 think face detection, where algorithms aim to find facial expressions and patterns in images. While image detection aims to distinguish one object from another to identify how many separate entities there are within an image, image recognition focuses on identifying the objects of interest within an image and recognizing which category or class they belong to.\"]}),/*#__PURE__*/e(\"h2\",{children:\"A brief history of computer vision\"}),/*#__PURE__*/t(\"p\",{children:[\"The first steps toward what would later become image recognition were taken in the late 1950s. However, computer vision as an academic discipline really took off in the 1960s at universities that were pioneering the development of AI. Early researchers recognized the potential of AI to change the world. 
With the goal of imitating human brain and eye sight, computer vision was brought to life through a \",/*#__PURE__*/e(a,{href:\"https://dspace.mit.edu/handle/1721.1/6125\",motionChild:!0,nodeId:\"eo4RAmtig\",openInNewTab:!1,relValues:[],scopeId:\"contentManagement\",smoothScroll:!1,children:/*#__PURE__*/e(i.a,{children:\"summer project\"})}),' in 1966 when researchers attached a camera to a computer and had it \"describe what it saw\" \u2014 kicking off a new and exciting stage of development.']}),/*#__PURE__*/e(\"p\",{children:\"What made computer vision a cutting-edge prospect at the time was the goal of extracting 3D structures from images to achieve a complete understanding of the scene. Studies from the 1970s formed the basis of many of the computer vision algorithms we use today, such as extracting edges, labeling lines, representing objects as interconnections of smaller structures, and so on. Later studies evolved to incorporate more intense mathematical and quantitative analyses \u2014 driving progress and innovation forward. These included scale-space, contour models, detecting shape based on shading, texture, focus, and more.\"}),/*#__PURE__*/e(\"p\",{children:\"The 1990s ushered in a new stage of growth including projective 3D reconstructions that led to greater awareness of camera calibration, which in turn, led to new methods for reconstructing scenes from multiple images. Variations of graph cuts were used to solve image segmentation and more. A major transition came about with the increased interaction between computer graphics and computer vision, including image-based rendering, image morphing, panoramic image stitching, and light-field rendering. 
No doubt you've heard of some of these terms already.\"}),/*#__PURE__*/e(\"p\",{children:\"Present day examples of research and innovation include the advancement of deep learning techniques that have propelled computer vision to a new level \u2014 increasing the accuracy of algorithms on data sets for image classification, image segmentation, optical flow, and more.\"}),/*#__PURE__*/e(\"h2\",{children:\"AI image recognition of today and future applications\"}),/*#__PURE__*/e(\"p\",{children:\"Image recognition today is carried out in a variety of ways, but most methods involve the use of supervised learning, neural networks, and deep learning algorithms. Convolutional neural networks help ML-based systems improve their ability to identify an image's subject.\"}),/*#__PURE__*/e(\"p\",{children:\"One of the most promising areas of research and development is on new and emerging technologies that have the potential to revolutionize many industries and improve the quality of life for people everywhere \u2014 from healthcare, including more precise diagnoses of diseases, to finance, through fraud detection based on image analyses of banknotes.\"}),/*#__PURE__*/e(\"p\",{children:\"Below are several examples of future applications of this technology:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Refining augmented reality\"}),\" \u2014 Primarily in software and game development for more realistic experiences. The gaming industry has been making significant strides in this area, but image recognition software development is not just limited to this one industry. 
It also serves as the foundation for applications in advertising, which use augmented reality.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Enhancing medical imagery\"}),\" \u2014 Images make up the primary source of data for the healthcare industry. Smart picture recognition systems will be able to train these medical photos while improving diagnoses and early detection practices.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Empowering educators\"}),\" \u2014 Image recognition algorithms make it possible for students with learning disabilities to record their knowledge. For example, text-to-speech and vision-based programs.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Advancing self-driving cars\"}),\" \u2014 An exciting development in this area is that researchers are close to creating AI that would enable cars to see in the dark thanks to the image recognition algorithm.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Teaching machines to see\"}),\" \u2014 Teaching machines to recognize visuals, analyze them, and make decisions based on visual input has tremendous potential for production around the world, as seen in industrial and manufacturing processes already.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Improving facial recognition\"}),\" \u2014 Facial recognition technology is frequently used for biometric identification, where a person's identity is verified by scanning their facial features. 
The advancement of image recognition techniques is bringing about new possibilities for facial recognition use across industries with improved accuracy and novel applications.\"]})})]}),/*#__PURE__*/e(\"p\",{children:\"Of course, apart from these, there are many other advances and future applications for AI. The possibilities are truly limitless.\"}),/*#__PURE__*/e(\"h2\",{children:\"AI image recognition technology\"}),/*#__PURE__*/e(\"p\",{children:\"As mentioned, AI-based technologies have grown in significance across industries such as healthcare, retail, security, agriculture, and more.\"}),/*#__PURE__*/e(\"p\",{children:\"Below we explore some common applications of these technologies:\"}),/*#__PURE__*/e(\"h3\",{children:\"Facial recognition and analysis\"}),/*#__PURE__*/e(\"p\",{children:\"This includes facial identification, recognition, and verification using cameras or webcams. With the help of AI algorithms, the latest software has made countless applications possible including face detection, alignment, and pose estimation, as well as gender recognition, smile detection, and age estimation using deep convolutional neural networks.\"}),/*#__PURE__*/e(\"h3\",{children:\"Medical image analysis\"}),/*#__PURE__*/e(\"p\",{children:\"Visual recognition technology helps computers to understand visual data that is routinely acquired throughout the course of a patient's treatment; for example, detecting a bone fracture.\"}),/*#__PURE__*/e(\"h3\",{children:\"Animal monitoring\"}),/*#__PURE__*/e(\"p\",{children:\"In agriculture and farming, AI image recognition algorithms are used to observe animals and other livestock for diseases, anomalies, as well as for compliance with animal welfare standards, industrial automation, and more.\"}),/*#__PURE__*/e(\"h3\",{children:\"Pattern and object detection\"}),/*#__PURE__*/e(\"p\",{children:\"AI photo and video recognition technologies can be used to identify objects, people, patterns, logos, places, colors, and shapes. 
And the image recognition aspect of these technologies can be customized across software. For example, if a model is trained to detect people in a video frame, it can then be applied to counting people, as is done in retail.\"}),/*#__PURE__*/e(\"h3\",{children:\"Image-based plant identification\"}),/*#__PURE__*/e(\"p\",{children:\"Used widely in research, nature management, and sustainability efforts, image recognition systems can also help identify plant species, monitor for diseases, and track growth cycles. Likewise, such systems can map crop quality.\"}),/*#__PURE__*/e(\"h3\",{children:\"Food recognition\"}),/*#__PURE__*/e(\"p\",{children:\"As seen in computer-aided dietary assessments, image recognition works to improve the accuracy of dietary intake measurements by analyzing food images taken on digital devices and shared online.\"}),/*#__PURE__*/e(\"h3\",{children:\"Image search\"}),/*#__PURE__*/e(\"p\",{children:\"Also referred to as visual search, image search uses visual features learned from a deep neural network to develop organized and scalable ways for retrieving images with the broader aim of carrying out content-based image retrieval.\"}),/*#__PURE__*/e(\"h3\",{children:\"Production-line quality assurance\"}),/*#__PURE__*/e(\"p\",{children:\"Applied primarily in the production and manufacturing sector for testing and inspections, an image recognition system can also be used for quality assurance by helping to detect product defects or flaws.\"}),/*#__PURE__*/e(\"h3\",{children:\"Automobile manufacturing\"}),/*#__PURE__*/e(\"p\",{children:\"Think autonomous vehicles. Image recognition plays a significant role in how successfully self-driving cars can navigate their environment without a person sitting behind the wheel. 
Perfecting this technology would be a breakthrough in the way we drive.\"}),/*#__PURE__*/e(\"h3\",{children:\"Security and surveillance\"}),/*#__PURE__*/e(\"p\",{children:\"The technology can be used to train a computer to identify people or objects based on their appearance, while giving security personnel a break from having to monitor multiple displays at once.\"}),/*#__PURE__*/e(\"h3\",{children:\"Automation of administrative processes\"}),/*#__PURE__*/e(\"p\",{children:\"Paying bills, scheduling appointments, collecting data and any other type of repetitive or monotonous task has the potential to be automated with the help of several AI methods including image recognition systems.\"}),/*#__PURE__*/e(\"h3\",{children:\"Asset management and project monitoring\"}),/*#__PURE__*/e(\"p\",{children:\"In energy, construction, rail, or shipping, for example. Defects such as rust, missing bolts and nuts, damage or objects that do not belong where they are can be identified with the help of object detection and object recognition.\"}),/*#__PURE__*/e(\"p\",{children:\"These are just some of the many applications, but there are countless other ways in which this cutting-edge technology can be put to good use.\"}),/*#__PURE__*/e(\"h2\",{children:\"Which image recognition model to choose?\"}),/*#__PURE__*/e(\"p\",{children:\"Pretrained image recognition models that are based on Convolutional Neural Networks (CNN) are at the center of AI image recognition technology. Another key element of image recognition is having the right training data, which must be collected, annotated, and fed into these models to retrain and fine-tune them for specific downstream applications. Accuracy is the main benchmark for evaluating image recognition tools. 
Factors like speed and adaptability are usually considered at a later point.\"}),/*#__PURE__*/e(\"p\",{children:\"Common CNN-based pretrained models for image recognition work include:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Faster R-CNN (Region-based CNN)\"}),\" \u2014 A two-stage pretrained model that uses a CNN to produce candidate object regions, which are then passed through a separate CNN to classify images and refine bounding boxes. Accuracy is the key benefit, but it can take a long time to retrain.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"You Only Look Once (YOLO)\"}),\" \u2014 A one-stage model that uses a CNN to predict class labels and bounding boxes of objects in an image. The main advantage is fast inference time (or quick delivery) and low memory usage. On the downside, it's less accurate than Faster R-CNN.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Single Shot MultiBox Detector (SSD)\"}),\" \u2014 A one-stage model that uses a single CNN to predict bounding boxes and class labels of objects in an image. It has a good balance between accuracy and performance speed.\"]})})]}),/*#__PURE__*/t(\"p\",{children:[\"In terms of model evaluation, deployment, and monitoring, human annotators play a key role in gauging the performance of AI-assisted image recognition solutions when faced with new, previously unseen data. 
That's where \",/*#__PURE__*/e(a,{href:\"https://toloka.ai/data-labeling-platform/\",motionChild:!0,nodeId:\"eo4RAmtig\",openInNewTab:!1,relValues:[],scopeId:\"contentManagement\",smoothScroll:!1,children:/*#__PURE__*/e(i.a,{children:\"Toloka's crowd contributors\"})}),\" come into play.\"]}),/*#__PURE__*/e(\"h2\",{children:\"Data markup and image annotation on Toloka\"}),/*#__PURE__*/e(\"p\",{children:\"When it comes to training data labeling for AI-assisted image recognition applications, crowdsourcing helps to distribute image annotation tasks among hundreds or thousands of annotators \u2014 quickly, efficiently and at a low cost.\"}),/*#__PURE__*/e(\"p\",{children:\"Toloka's crowd contributors can complete the following image labeling tasks:\"}),/*#__PURE__*/e(\"h3\",{children:\"Image segmentation\"}),/*#__PURE__*/e(\"p\",{children:\"This involves object recognition and drawing pixel-wise boundaries for each object or group of objects.\"}),/*#__PURE__*/e(\"h3\",{children:\"Bounding box\"}),/*#__PURE__*/e(\"p\",{children:\"Identifying objects in images that match certain classes and using bounding boxes to mark the location.\"}),/*#__PURE__*/e(\"h3\",{children:\"Polygon\"}),/*#__PURE__*/e(\"p\",{children:\"Identifying objects in images that match certain classes and drawing pixel-perfect polygons around the exact shape.\"}),/*#__PURE__*/e(\"h3\",{children:\"Keypoint\"}),/*#__PURE__*/e(\"p\",{children:\"Labeling feature details in human faces to identify facial landmarks, expressions, or emotions.\"}),/*#__PURE__*/e(\"h3\",{children:\"Image classification\"}),/*#__PURE__*/e(\"p\",{children:\"Image classification is done by matching visual content with one or more predefined categories.\"}),/*#__PURE__*/e(\"h3\",{children:\"Image transcription\"}),/*#__PURE__*/e(\"p\",{children:\"Transcribing text in PDF files and using labeled data to train text recognition algorithms or validate and fine-tune the output of OCR 
models.\"}),/*#__PURE__*/e(\"h3\",{children:\"Side-by-side comparison\"}),/*#__PURE__*/e(\"p\",{children:\"Using side-by-side image comparisons to verify or clean up data, looking at two images and picking the one that's better.\"}),/*#__PURE__*/e(\"h3\",{children:\"Image and video collection\"}),/*#__PURE__*/e(\"p\",{children:\"Collecting datasets of videos or images related to a common theme, or with a specific type of lighting or environment.\"}),/*#__PURE__*/t(\"p\",{children:[\"Visit our \",/*#__PURE__*/e(a,{href:\"https://toloka.ai/blog/\",motionChild:!0,nodeId:\"eo4RAmtig\",openInNewTab:!1,relValues:[],scopeId:\"contentManagement\",smoothScroll:!1,children:/*#__PURE__*/e(i.a,{children:\"blog\"})}),\" to learn more about the benefits of crowdsourcing and to discover what other types of data labeling tasks Tolokers are involved in when it comes to the wider machine learning pipeline.\"]}),/*#__PURE__*/e(\"h2\",{children:\"Key takeaways\"}),/*#__PURE__*/e(\"p\",{children:\"As a recap, image recognition essentially means identifying objects within an image and categorizing the image accordingly. Image, photo, and picture recognition are all basically the same thing. In this article, we've defined image recognition as an application of AI and explained how it relates to computer vision, while covering everything from the origins of this technology to future scenarios and opportunities.\"}),/*#__PURE__*/e(\"p\",{children:\"Obviously, image recognition seems like a simple task to us humans. If you look at an object or scene in an image, you can automatically make distinctions between subjects and identify what you see. 
For a machine, however, this is highly complex, which makes AI image recognition a long-standing research topic in the field of computer vision.\"}),/*#__PURE__*/e(\"p\",{children:\"As different methods to simulate human vision have evolved over time, the main idea behind image recognition has stayed the same: the ability of machines to classify objects into different categories \u2014 and determine the category to which an image belongs.\"}),/*#__PURE__*/e(\"p\",{children:\"Image recognition combined with deep learning is a key application of today's AI vision and is used to power a wide range of real-world use cases. Recent advances have led to great results across computer vision and image recognition tasks. And no doubt that progress will continue for years to come.\"})]});export const richText3=/*#__PURE__*/t(n.Fragment,{children:[/*#__PURE__*/e(\"p\",{children:\"In this post, we'll cover the data labeling process in the context of the ML pipeline, from data collection to model training and monitoring. You'll learn what options there are for data-driven ML: from using ready-made datasets to collecting your own data with an in-house team or data labeling outsourcing.\"}),/*#__PURE__*/e(\"h2\",{children:\"Why do we need labeled data?\"}),/*#__PURE__*/e(\"p\",{children:'Creating an artificial intelligence-powered product is no easy feat \u2013 the road from inception to unveiling it to the public is paved with numerous challenges. Specialists from a multitude of fields bring their expertise onboard to ensure that a downstream application can satisfy the end user and can therefore successfully compete in the marketplace. A regular AI product pipeline (i.e., the so-called \"ML value chain\") can be roughly divided into the model- and data-oriented parts. 
It normally looks something like this:'}),/*#__PURE__*/e(\"img\",{alt:\"Why do we need labeled data?\",className:\"framer-image\",height:\"700\",src:\"https://framerusercontent.com/images/BYUNz5mjvlyn34kCBx1RugyVhw0.jpeg\",srcSet:\"https://framerusercontent.com/images/BYUNz5mjvlyn34kCBx1RugyVhw0.jpeg?scale-down-to=512 512w,https://framerusercontent.com/images/BYUNz5mjvlyn34kCBx1RugyVhw0.jpeg?scale-down-to=1024 1024w,https://framerusercontent.com/images/BYUNz5mjvlyn34kCBx1RugyVhw0.jpeg?scale-down-to=2048 2048w,https://framerusercontent.com/images/BYUNz5mjvlyn34kCBx1RugyVhw0.jpeg 2560w\",style:{aspectRatio:\"2560 / 1400\"},width:\"1280\"}),/*#__PURE__*/e(\"p\",{children:'Selecting the right model and \"fine-tuning it\" (i.e., improving it) is crucial to achieving success when you\\'re preparing a downstream application (i.e., an AI product that serves a particular purpose). We\\'ve talked about \"foundation models\" that serve as the basis for model fine-tuning in our previous posts here (Natural Language Processing / NLP) and here (Computer Vision / CV). But it\\'s also important to remember not to adopt the \"model-centric approach\" exclusively by disregarding data work, because no matter how good a machine learning model is (i.e., a training model acting as instructions for AI), it always needs training data to function, which is why some refer to data as \"food for AI.\"'}),/*#__PURE__*/e(\"p\",{children:'In fact, one could argue that good data is even more important than a machine learning model, because the pipeline displayed above consists of three stages that focus on data-related work and an additional three stages that focus on data-related work alongside model-related work. So, in effect all six stages of the ML value chain require high-quality data without exception. Those who understand this organize their business processes accordingly and ultimately put out a successful AI product free of biases and ethical issues. 
This is referred to as adopting the \"data-centric approach\" to AI, which we at Toloka consider the right way to move forward.'}),/*#__PURE__*/e(\"h2\",{children:\"Data collection\"}),/*#__PURE__*/e(\"p\",{children:\"Now that we've decided to proceed using the data-centric approach, the question is what do we do next, that is, who's doing what and how exactly? As our pipeline suggests, any machine learning project has to start with training data collection. A number of methodologies currently exist that allow AI product developers to obtain raw (i.e., unlabeled) data. Let's take a look at the pros and cons of each approach.\"}),/*#__PURE__*/e(\"h3\",{children:\"Using ready libraries and datasets\"}),/*#__PURE__*/e(\"p\",{children:\"This is a methodology that some AI product makers utilize, and it can sometimes provide acceptable datasets. The main advantage of it is simplicity. You don't have to do anything from scratch \u2013 simply take something pre-made and ready. There are, however, two major drawbacks.\"}),/*#__PURE__*/e(\"p\",{children:\"First of all, other product developers are likely to use the same data as you, which makes your data generic by definition. To rephrase that, you can't hope for an exclusive product if your data isn't exclusive. And secondly, you have to trust someone else as far as ensuring that the data is actually valid and up-to-date. If it's not, you're basically back to square one (and hopefully not when you're already too far in).\"}),/*#__PURE__*/e(\"h3\",{children:\"Crawling and scraping\"}),/*#__PURE__*/e(\"p\",{children:\"This approach is about finding and extracting useful data samples available on the web. Here, too, AI product developers take something that's already there, without creating anything new. 
The advantage is that with this approach, you stand a better chance of obtaining a more unique dataset compared to the first option.\"}),/*#__PURE__*/e(\"p\",{children:\"However, with this approach you also have a limited degree of control over the quality of your extracted data. And, what's worse, this data hasn't been \\\"cleared,\\\" meaning that different parts of your dataset may contain copyrighted materials and sensitive personal information, which in some cases may result in serious repercussions, including lawsuits. For this reason, data scraping is banned in some countries.\"}),/*#__PURE__*/e(\"h3\",{children:\"Synthetic data generation\"}),/*#__PURE__*/e(\"p\",{children:\"This option implies generating new data synthetically, that is creating fake data that resembles real data. The advantage is that those who choose this route can indeed get their hands on exclusive data. One of the drawbacks here is that depending on the particulars of your dataset requirements (i.e., what exactly you need and how much of it), quite a bit of computational power may be required \u2013 something not every product developer possesses.\"}),/*#__PURE__*/e(\"p\",{children:\"Another issue, and a more serious one potentially, is that this data may be substantially divorced from the real world. This might not be such an issue if you were designing an AI product, say, as a training practice. However, if you need a real AI product that's up to the challenge out there in the real world (i.e., a product that can meet the needs of end users), it must be trained on real-world data (or some data approximating it to a reasonable extent), which is seldom the case when using synthetic data generation.\"}),/*#__PURE__*/e(\"h3\",{children:\"Outsourcing to individuals or companies\"}),/*#__PURE__*/e(\"p\",{children:'This option is more about \"who\" rather than \"how.\" Outsourcing implies giving a task (in this case, data collection and labeling) away to data labeling service providers. 
This may be an individual or a group of individuals that as an AI developer you can put together yourself (for example, through LinkedIn). Or, it may be a ready team of individuals, in other words a company, that can accept your data collection task as a turnkey challenge (i.e., from nothing to a ready set).'}),/*#__PURE__*/e(\"p\",{children:\"How exactly your outsourced team will get you the data may differ from team to team, but they're likely to use one of the methods we mentioned earlier. The advantage is that your outsourced data labeling team will probably know more about obtaining raw data than your own team, provided you have no prior experience. You also wouldn't need to worry about this stage of the pipeline, at least in theory.\"}),/*#__PURE__*/e(\"p\",{children:\"The main disadvantage is that your company may have to incur significant expenses of using data annotation services. They are likely to be higher than some other options, while at the end of the day it'll still be your job to make sure that the quality of your new dataset is acceptable (which may or may not be so).\"}),/*#__PURE__*/e(\"h3\",{children:\"Crowdsourcing\"}),/*#__PURE__*/e(\"p\",{children:\"Crowdsourcing is a particular type of outsourcing that can be utilized throughout different stages of the ML value chain. This approach is becoming more and more popular today as more AI developers acknowledge its high time- and cost-effectiveness. Whereas regular outsourcing can be more expensive than other approaches, and it often takes more time, the price tag on crowdsourcing tasks is usually much more reasonable, and these tasks can be carried out in a matter of days or even hours.\"}),/*#__PURE__*/e(\"p\",{children:'This is possible due to what\\'s known as \"aggregation,\" which is like a digital game of tug-of-war played against the data. 
The logic is that rather than having fewer narrowly trained specialists tackle various data-related tasks, with crowdsourcing, a lot of people pull on the digital rope at the same time, and their efforts are put together by data managers known as \"data annotation specialists\" or Crowd Solutions Architects (CSAs). The same subtask may be completed by several people, and the end result is an amalgam of their accumulated \"best of\" efforts that serve to ensure dataset quality.'}),/*#__PURE__*/e(\"p\",{children:'In the context of data collection, \"spatial crowdsourcing\" is the most ubiquitous type, which is also known as \"field\" or \"feet-on-street\" tasks. One of the major upsides of this approach is that completely new data is generated (as opposed to generic stocks), and this data comes from the real world (as opposed to synthetic options). In this scenario, \"crowd contributors\" as they\\'re known visit places and objects of interest in person and take photos (e.g., pets, cafes, or billboards), make videos (e.g., moving traffic), record sounds (e.g., voices), or write text (e.g., descriptions of floor plans) in real time, that is \"on location.\"'}),/*#__PURE__*/e(\"h2\",{children:\"Data processing\"}),/*#__PURE__*/e(\"p\",{children:'Data processing, also known as data preparation, is a stage of the ML value chain during which collected data is prepared for labeling (not to be confused with another stage that\\'s sometimes inserted in the pipeline known as \"data preprocessing\" / \"final processing\" that occurs immediately before machine learning model training). If you\\'re interested in knowing more about data preparation, we recommend that you have a look at some other posts in this blog that address this stage in more detail. 
Suffice it to say, data preparation involves data cleaning and removal of corrupted files, faults, irregularities, duplicates, missing values, and other issues.'}),/*#__PURE__*/e(\"p\",{children:'It\\'s also during this stage that machine learning engineers gauge their datasets to find the right balance between bias and variance (i.e., \"the bias-variance trade-off\") in order to end up with a model that is neither too specific (\"overfitting\") nor too general (\"underfitting\"). When things tilt too much in one of the unwanted directions, techniques like data augmentation can be used to even things out. This is done to have an optimally performing model.'}),/*#__PURE__*/e(\"h2\",{children:\"Data labeling process\"}),/*#__PURE__*/e(\"p\",{children:'When an AI developer has collected enough relevant data and this data has been cleaned up and augmented if necessary, it now has to be \"annotated\" or \"labeled.\" This means the data basically has to be \"explained\" in a way that a machine will understand before an ML engineer can proceed with feeding it into the training model. Some foundation models make use of large quantities of unlabeled data; however, when an AI developer prepares a downstream application aiming to solve a particular user-oriented problem, a foundation model always has to be retrained (i.e., \"fine-tuned\") using labeled data.'}),/*#__PURE__*/e(\"p\",{children:\"Depending on what sort of downstream application is required, different types of data labeling may take place. 
For instance, image annotation, drawing outlines of different objects and shapes (bounding boxes or polygons), transcribing speech from audio files, video annotation, providing titles or summaries for written texts, and so on.\"}),/*#__PURE__*/e(\"p\",{children:\"There are three main types or rough categories of data labeling:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Human-in-the-loop labeling, i.e., manual labeling carried out by human annotators.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Synthetic labeling, i.e., data labeling carried out by machines.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Hybrid labeling, i.e., a format that combines elements of the first two types.\"})})]}),/*#__PURE__*/e(\"p\",{children:'Both synthetic labeling and, to a lesser extent, hybrid labeling come with the same sets of advantages and shortcomings as the synthetic data generation we\\'ve already discussed. 
These options are somewhat less of a \"hassle,\" which is good, but at the same time, they provide less labeling accuracy and little \"real-worldness\", while demanding a lot of computational power in return.'}),/*#__PURE__*/e(\"p\",{children:\"On the other hand, human-in-the-loop annotation can be divided into two major camps: in-house labeling (done by an internal team) and outsourcing data labeling (of which crowdsourcing is arguably the most effective type), which raises a question:\"}),/*#__PURE__*/e(\"h4\",{children:\"What to do: go for in-house labeling or outsource data labeling services?\"}),/*#__PURE__*/e(\"p\",{children:\"The main advantage of the in-house option is that it entails the greatest degree of control and data security \u2013 you have your own annotation team that handles all aspects of data labeling in the same space that your ML engineers prep their code. Consequently, as a project manager, you have a great deal of leverage.\"}),/*#__PURE__*/e(\"p\",{children:\"However, there are also a number of major disadvantages to this approach. To start with, in-house labeling is a slow process, because you normally have a finite number of team members dedicated to this stage of the pipeline who usually have other roles within the company and who often have to learn to data-label from scratch. In contrast to crowdsourcing, there's no aggregation, which means each task component has to be tackled piecemeal rather than cumulatively.\"}),/*#__PURE__*/e(\"p\",{children:\"Another issue is that in-house labeling is expensive. One reason is that time is money, so if something is slow, it means it's costing you more as a general rule, that is, your data-labeling progress affects your product's time to market. 
But another reason is that your in-house team is all about \\\"do it yourself.\\\" This means supplying your team with everything they may require including annotation tools, subscriptions, and even software training if necessary.\"}),/*#__PURE__*/e(\"h2\",{children:\"Crowdsourced data labeling\"}),/*#__PURE__*/e(\"p\",{children:\"With crowdsourcing, the situation is very different. Among its drawbacks is the fact that as an AI product developer, you have less control over the data-labeling stage of your pipeline compared to the in-house route (though close supervision during crowdsourced data annotation is allowed and even encouraged by platforms like Toloka).\"}),/*#__PURE__*/e(\"p\",{children:\"On the plus side, your time to market is reduced substantially as the whole process takes a lot less time. As a result of this and the fact that you don't need to provide any training for your staff or purchase multiple software tools, the final bill for this stage of the pipeline also ends up being significantly lower.\"}),/*#__PURE__*/e(\"p\",{children:\"At the same time, AI product developers gain access to crowd contributors with specific profiles and skill sets that often cannot be found elsewhere, let alone in a regular office setting of an in-house team based in one location. For instance, a particular dataset may require data labelers who are speakers of a relatively uncommon African language, a person who can identify parts of a motorboat engine, or someone who can take photos and label street signs in Quebec. The power of the global crowd makes this possible, and few options exist that can step up to these challenges and offer the same timeframe or financial conditions.\"}),/*#__PURE__*/e(\"p\",{children:\"All praise aside, this approach requires a great deal of attention and expertise on the part of the crowdsourcing platform. 
To ensure that a crowdsourcing task is carried out successfully, the following steps should be taken by those who offer this type of data labeling services:\"}),/*#__PURE__*/e(\"h3\",{children:\"Task decomposition\"}),/*#__PURE__*/e(\"p\",{children:\"Larger tasks should be broken into more manageable pieces, with each one treated as a separate project task.\"}),/*#__PURE__*/e(\"h3\",{children:\"Clear instructions\"}),/*#__PURE__*/e(\"p\",{children:\"In order to avoid confusion, misunderstandings, or personal biases, detailed instructions with clear examples should be provided to data annotators. From our experience, the better the instructions, the more accurate the results.\"}),/*#__PURE__*/e(\"h3\",{children:\"Intuitive interface\"}),/*#__PURE__*/e(\"p\",{children:\"An interface that a platform uses (usually a creation of their own) should allow data annotators to carry out and submit labeling tasks in the simplest and fastest way possible. When building such an interface, UX design (user experience design) is always considered, that is, how effectively data annotators can interact with it (functionality).\"}),/*#__PURE__*/e(\"h3\",{children:\"Quality control\"}),/*#__PURE__*/e(\"p\",{children:\"Quality control tools should be configured and integrated into labeling projects to ensure high-quality results. 
They include mechanisms like CAPTCHA (proceed only after a deciphered word has been entered), speed monitoring (proceed when it's clear that a sufficient amount of time has elapsed), and action checking (proceed only after certain actions like clicking on a link or scrolling down have taken place, etc).\"}),/*#__PURE__*/e(\"h3\",{children:\"Flexible pricing\"}),/*#__PURE__*/e(\"p\",{children:'The best possible price should be worked out that reflects a fair compromise between how much crowd contributors would like to be paid for a task and how much an AI application developer (a \"requester\") is willing to pay.'}),/*#__PURE__*/e(\"h3\",{children:\"Verification of results\"}),/*#__PURE__*/e(\"p\",{children:\"After final submissions by data annotators, the results should be aggregated and statistical tests should be run to ensure quality and accuracy.\"}),/*#__PURE__*/e(\"h2\",{children:\"Model evaluation before and after deployment\"}),/*#__PURE__*/e(\"p\",{children:'Machine learning models\\' performance or \"prediction\" evaluation is carried out by AI product developers both before deployment (aka post-training or initial model evaluation) and after deployment / in production (aka model monitoring). Since all AI downstream applications are made for end users who are real people, human-in-the-loop evaluation is considered the industry standard. For this reason, crowdsourcing for the purposes of ML model evaluation has become one of the most sought-after methodologies.'}),/*#__PURE__*/e(\"p\",{children:\"It works much the same as crowdsourced data labeling, except that the goal here is not to get the data ready for model fine-tuning, but rather to see whether the now trained model can perform well when encountering new and previously unseen data. 
Two routes can be taken:\"}),/*#__PURE__*/e(\"h3\",{children:\"Straight-up evaluation\"}),/*#__PURE__*/e(\"p\",{children:\"Data annotators are shown predictions (i.e., responses) made by the model, and they have to rate the model's precision and accuracy. For example, a model for Computer Vision may be required to name colors of different objects. The job of the annotators who are evaluating this model would be to say whether the model's labels corresponding to differently colored objects are correct, that is, whether this really is violet, this really is beige, and so on. This will provide enough information for ML engineers to understand how well their downstream application is doing.\"}),/*#__PURE__*/e(\"h3\",{children:\"Evaluation with fine-tuning\"}),/*#__PURE__*/e(\"p\",{children:\"This is a more elaborate version of the first option that entails two stages. The first one is to give human annotators new data \u2013 the same data that the model encountered post-training. Following our previous example, the annotators respond to this data and put different color names to differently colored objects. The second stage is to compare their responses to the model's responses (for example, using a pairwise comparison or Side-by-Side).\"}),/*#__PURE__*/e(\"p\",{children:'The downside of this second option is that it takes a bit more time, but the greatest benefit is that it allows AI application developers not just to gauge their model\\'s performance \u2013 they can also fine-tune their ML model if necessary, because they now have an additional labeled dataset with high-quality responses provided by human annotators (i.e., \"golden sets\" or \"honeypots\").'}),/*#__PURE__*/e(\"h2\",{children:\"Bottom line\"}),/*#__PURE__*/e(\"p\",{children:\"A number of approaches that tackle the data-oriented parts of the ML value chain are utilized by AI product developers today. 
While the in-house route gives project managers more control, generally this approach is not time- and cost-effective. Outsourcing offers a viable alternative, with crowdsourcing being considered by many the fastest, the most affordable, and often the most reliable of all outsourcing options.\"}),/*#__PURE__*/e(\"p\",{children:\"Crowdsourcing can be utilized during data collection (spatial crowdsourcing, feet-on-street, or field tasks), during data labeling for a variety of domains and specific applications (e.g., NLP or CV), and during all types of performance evaluation \u2013 be it post-training evaluation or model monitoring in production. In addition, model evaluation through crowdsourcing can assist with further model fine-tuning whenever necessary.\"}),/*#__PURE__*/e(\"p\",{children:\"Importantly, crowdsourcing also allows ML practitioners, data scientists, and social science researchers to access a hard-to-reach demographic all over the world that remains largely inaccessible via any other means.\"})]});export const richText4=/*#__PURE__*/t(n.Fragment,{children:[/*#__PURE__*/e(\"p\",{children:\"People may find it easier to access and utilize texts for purposes beyond reading, such as quick information searches for instance, if they are digitized. An enormous quantity of texts has accumulated since the time of the invention of paper. Hence towards the end of the twentieth century, after it became apparent to people that the task of recognition for digitizing could only be achieved by employing automatic methods, optical character recognition (OCR) technology began to be actively developed.\"}),/*#__PURE__*/e(\"p\",{children:\"OCR examines scanned images of printed text and transforms those images into digital texts. Though the most sophisticated OCR models can identify almost every font type, they only work with printed text and dismiss handwritten data.\"}),/*#__PURE__*/e(\"p\",{children:\"To recognize handwritten text from images, use OCR software. 
Try recognizing handwritten text with a mobile application which has OCR features. Another solution is to scan handwritten text and use desktop or online OCR-powered applications. And finally, some scanners also have some kind of OCR software which you can use, be it a built-in software, or a downloadable tool provided by a hardware manufacturer.\"}),/*#__PURE__*/e(\"p\",{children:\"Handwritten text recognition (HTR) describes a computer-assisted automated approach to the deciphering of written records. This type of handwriting recognition would provide a great opportunity to automate the workflow of many businesses, thus simplifying the work of a human being. Both technologies are very similar, but OCR is already in an advanced state, whereas HTR is still in an early phase.\"}),/*#__PURE__*/e(\"p\",{children:\"The easiest and still widely employed text recognition process involves matrix matching: with each letter in the initial image decomposed into pixel matrixes and then correlated with the matrixes held in the computer. Once they match, the individual character is considered to be recognized. This method is referred to as pattern matching and is mostly utilized in the recognition of printed texts. To make it clearer, OCR recognizes all characters one by one by applying this method.\"}),/*#__PURE__*/e(\"p\",{children:\"For handwritten text and other rare or nonstandard fonts, the conventional comparison of pixel matrices may not be applicable at all. A slightly modified approach is employed in this case, namely the recognition of separate features, such as lines, curves, and other sections of letters. 
Such a method is also called feature extraction or feature detection and is utilized for the identification of both typed and written texts.\"}),/*#__PURE__*/e(\"h2\",{children:\"Modern text recognition technologies\"}),/*#__PURE__*/e(\"h3\",{children:\"Optical character recognition\"}),/*#__PURE__*/e(\"p\",{children:\"OCR is the process of retrieving text from a picture. An image of a page represents a digital copy of text and other possible content. They can be obtained by scanning or photographing paper documents, books, letters, and so on.\"}),/*#__PURE__*/e(\"p\",{children:\"Such images do not contain text available for editing yet. Instead, they are a set of pixels that collectively form a pattern of text. With recognition, the picture is processed into a text that can be edited on a PC, without having to retype it by hand. The images are converted into text using optical recognition technology.\"}),/*#__PURE__*/e(\"p\",{children:\"Technologies like Intelligent character recognition (ICR) and Intelligent word recognition (IWR) are advanced subtypes of the standard optical character recognition (OCR) systems. They target handwritten rather than printed text and are incorporated into most modern recognition systems.\"}),/*#__PURE__*/e(\"h3\",{children:\"Intelligent character recognition\"}),/*#__PURE__*/e(\"p\",{children:\"ICR is an improved OCR or more precisely a type of handwritten text detection. It deals with the recognition of separate handwritten typed characters. ICR recognition software operates with individual characters by splitting symbols into elements such as lines, curves, or loops, to identify exactly what kind of character it is.\"}),/*#__PURE__*/e(\"p\",{children:\"Although this method comes with its limitations, ICR tools recognize highly structured, that is, evenly arranged characters. Examples include forms such as a questionnaire in which a person writes information in the fields reserved for individual letters. 
This kind of questionnaire is found, for example, in tests, when the correct answer or letter must be written in the dedicated boxes.\"}),/*#__PURE__*/e(\"p\",{children:\"Modern ICR software often features a self-learning capability: a neural network that updates the recognition database automatically based on new handwriting styles. It expands the document processing capabilities of OCR and HTR. Nevertheless, ICR does not perform cursive handwriting recognition as it may only detect each individually written character so far.\"}),/*#__PURE__*/e(\"h3\",{children:\"Intelligent word recognition\"}),/*#__PURE__*/e(\"p\",{children:\"ICR has also got a kind of evolution of its own, which is called intelligent word recognition. It is utilized for character recognition with unstructured, freehand, or cursive handwriting. It attempts to distinguish the entire word rather than individual characters.\"}),/*#__PURE__*/e(\"p\",{children:\"This process is most applicable to the recognition of free-form handwritten notes since it is not individual characters that are identified, but whole coherent phrases or words. IWR wasn't designed to be a substitute for ICR and OCR, on the contrary, today's applications combine all three approaches.\"}),/*#__PURE__*/e(\"p\",{children:\"IWR is intended to recognize real texts written by humans in cursive, which is often hard to recognize. Those handwritten notes cannot be recognized by ICR due to the nature of the method. IWR greatly minimizes errors that arise from typical recognition systems as it matches handwritten or printed words to a user-defined dictionary.\"}),/*#__PURE__*/e(\"h2\",{children:\"Methods for recognizing human written texts\"}),/*#__PURE__*/e(\"p\",{children:\"Every handwritten letter, despite being written differently by each person, still comprises the same parts. However, there are many more options as to how handwritten characters, as opposed to printed ones, may look. 
Thus, each separate symbol represents a characteristic feature of a letter, and the main task is to find it in the initial text to recognize it.\"}),/*#__PURE__*/e(\"p\",{children:\"Such tasks are handled by neural networks. Neural networks are a type of machine learning process, consisting of many simple mathematical calculations of the same nature. They are actively employed today to convert scanned handwriting into printed text.\"}),/*#__PURE__*/e(\"p\",{children:\"Neural networks rely on machine learning, but first, they have to learn how to efficiently recognize text. They learn to find patterns using labeled data. The algorithms continuously process the input data, categorizing them over and over again until clear patterns are found.\"}),/*#__PURE__*/e(\"p\",{children:\"ML models that can handle such a challenge demand a considerable amount of learning data. In many cases, such input data has already been processed and is available today. As an example, the MNIST dataset, in particular, includes about 70 000 pictures of hand-written digits, with the recognition accuracy of the algorithms based on it being very high reaching over 99% for convolutional neural networks.\"}),/*#__PURE__*/e(\"p\",{children:\"Neural networks have the power to analyze vast amounts of information that a human being would not be able to process. They filter massive swarms of data at high speed, capturing patterns that would otherwise evade one's focus. There are numerous techniques in the neural network approach. The most popular are Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Hopfield's networks, and many others.\"}),/*#__PURE__*/e(\"p\",{children:\"It is not possible to directly program the behavior of neural networks, they only undergo a process of training, which can be referred to as their primary advantage. 
This is due to the fact that they are able to make predictions with a level of certainty without being told by a human programmer what to do in each specific situation.\"}),/*#__PURE__*/e(\"p\",{children:\"Nowadays, a significant amount of libraries for text recognition have been created. The application of these libraries greatly simplifies the development of handwriting recognition models. To enhance recognition accuracy, a dataset may be assembled for specific purposes, such as the characteristics of images or a specific language.\"}),/*#__PURE__*/e(\"h2\",{children:\"Handwriting recognition\"}),/*#__PURE__*/e(\"p\",{children:\"Modern models of printed text recognition deliver rather high-quality results, demonstrating the relatively error-free conversion of an input image to text. However, these results are due to a limited set of fonts, which aim to be as humanly comprehensible as possible.\"}),/*#__PURE__*/e(\"p\",{children:\"All typographic fonts have somewhat the same outline. More often than not, these are clearly legible and have only slight stylistic differences, for example, some people do not see the differences in Arial and Calibri fonts, even though stylistically they are not the same. However, technically, it is easier to teach the computer to recognize fonts of this type, because the shapes and symbols that make up the letters of these fonts are mostly similar.\"}),/*#__PURE__*/e(\"p\",{children:\"Handwritten text recognition is a more complicated matter. Everyone has their own handwriting, which may even change as time passes. The variability of handwriting patterns is quite substantial. 
A person may develop a habit of writing a given character in a particular way over their lifetime, and nobody else may write it quite that way.\"}),/*#__PURE__*/e(\"p\",{children:\"Training a handwritten text recognition model involves creating a dataset, as mentioned earlier, which is not an easy task on its own. On top of that, there is the difficulty of labeling the gathered data.\"}),/*#__PURE__*/e(\"p\",{children:\"For instance, sometimes recognizing a historical document requires a specialist who is knowledgeable in the ways people used to write. If the handwritten text is very intricate, it may require two or more people to interpret it and label each letter correctly. However, even for simple datasets, several annotations by multiple people must exist so that errors that annotators often make when trying to label handwritten text can be corrected.\"}),/*#__PURE__*/e(\"h2\",{children:\"How to convert scanned handwriting?\"}),/*#__PURE__*/e(\"p\",{children:\"With the appropriate software, you may easily convert handwritten text to printed text. Such recognition involves the following steps for converting scanned pictures or photos into text:\"}),/*#__PURE__*/e(\"h3\",{children:\"Image processing\"}),/*#__PURE__*/e(\"p\",{children:\"In order to convert scanned handwriting into printed text, the input image must be stripped of noise and converted to a form that enables efficient character extraction and detection. Generally, the image is enhanced, its contrast is adjusted, it is straightened, and it is converted into the format used by the system.\"}),/*#__PURE__*/e(\"p\",{children:\"Binarization by thresholding plays an essential role: it transforms an image from a color or grayscale format into black-and-white. 
Such a conversion allows for a distinct separation of the text from the background, simplifies the further application of many algorithms, and also removes some noise from the image.\"}),/*#__PURE__*/e(\"h3\",{children:\"Highlighting the area of interest\"}),/*#__PURE__*/e(\"p\",{children:\"This step highlights the area of the image that contains the text to be recognized. In other words, a specialist has to detect handwriting in an image, while discarding elements that are not text. These include such objects as smudges and stains on the paper that were not removed during the binarization process.\"}),/*#__PURE__*/e(\"h3\",{children:\"Segmentation into lines and characters\"}),/*#__PURE__*/e(\"p\",{children:\"A text image has to be separated into lines, then the lines are divided into words and then into characters before the optical character recognition system can process each character individually.\"}),/*#__PURE__*/e(\"p\",{children:\"Since handwritten text, unlike typewritten text, is generally written following a certain curve, difficulties may arise in dividing the input images of handwritten text into lines, which means the algorithms suitable for typewritten texts cannot be applied directly. Lines may bend, or be too close together, and text elements belonging to different lines may overlap.\"}),/*#__PURE__*/e(\"p\",{children:\"Methods of baseline extraction attempt to trace some imaginary line along which a person writes, and then reconstruct a line from it. Following this step, different recognition systems employ their own unique algorithms.\"}),/*#__PURE__*/e(\"h3\",{children:\"Symbol processing\"}),/*#__PURE__*/e(\"p\",{children:\"The symbol image may be processed in its entirety by comparing it to available templates. 
Alternatively, the characteristics of the depicted symbol are extracted: the relevant features are selected and classified according to the criteria present in the application.\"}),/*#__PURE__*/e(\"h3\",{children:\"Recognition result\"}),/*#__PURE__*/e(\"p\",{children:\"The possible versions of the letter appear as output. Generally, however, the recognition system continues working through other methods, refining the achieved result. The recognition engine may not always follow all the mentioned steps, however, the basic actions of the recognition process are shared by all algorithms.\"}),/*#__PURE__*/e(\"h2\",{children:\"Making a handwriting recognition model\"}),/*#__PURE__*/e(\"p\",{children:\"For all of the steps described above to be possible with handwritten text, a trained handwriting recognition model has to be created. The following are the basic steps for creating such a model.\"}),/*#__PURE__*/e(\"h3\",{children:\"Data gathering\"}),/*#__PURE__*/e(\"p\",{children:\"The first thing specialists have to do is to collect a training data set containing images with words with different handwriting in the language they plan to work with. It may include photos, scans of handwritten notes, scanned documents, letters, and so on.\"}),/*#__PURE__*/e(\"p\",{children:\"Model developers can use ready-made datasets, of which there is now a large number and many are freely available. Alternatively, they can build their own datasets. For example, they can distribute dedicated word-writing forms to a large group of people, such as students or colleagues, in order to cover as many handwritings as possible.\"}),/*#__PURE__*/e(\"p\",{children:\"A faster solution would be to gather training data through crowdsourcing. This relatively new approach is an effective tool for gathering vast amounts of data. 
On crowdsourcing platforms, the customer gives the assignment to a large group of people, most often freelancers, who will submit images of handwritten text for a small fee.\"}),/*#__PURE__*/e(\"h3\",{children:\"Annotation\"}),/*#__PURE__*/e(\"p\",{children:\"Graphic images of a document, including those written by hand, do not represent a text document yet. The human brain is designed in such a way that it is enough just to look at a sheet of paper with text to understand what is written on it (depending on the handwriting, of course, as some is incomprehensible even to humans). From the computer's point of view, a scanned document is just a set of colored dots and does not look like a text document at all. The model cannot extract relevant features on its own.\"}),/*#__PURE__*/e(\"p\",{children:\"Therefore, the data collected has to be annotated, because the model cannot learn to identify letters in the image on its own. Instead, it needs to be shown how the handwritten symbol corresponds to the printed letter so that in the future it can extract text from handwritten notes and help people recognize similar characters. As already mentioned, it takes more than one person to get a better result and avoid annotation errors.\"}),/*#__PURE__*/e(\"p\",{children:\"This is where crowdsourcing comes in handy again. The final decision on assigning a particular letter to an image is reached through the agreement of contributors located all over the world. To filter out unscrupulous contributors, it is essential to create high-quality control tasks against which a person's expertise may be evaluated.\"}),/*#__PURE__*/e(\"h3\",{children:\"Model training\"}),/*#__PURE__*/e(\"p\",{children:\"Once all photos, scans, and documents containing human-written text have labels, specialists can begin training the model. 
As a final result of the training, the recognition model has to be able to provide a reliable output: a text file in a digital format.\"}),/*#__PURE__*/e(\"p\",{children:\"Moreover, the text must be of high quality, that is, just a set of incoherent letters will not do. If this is the case, ML specialists may conclude that either the dataset was of poor quality, it was improperly labeled, or the training process was faulty. Ideally, all cases of duplicate characters, repeated characters, and unrecognized ones should be solved.\"}),/*#__PURE__*/e(\"p\",{children:\"One way or another, once recognition problems are detected, specialists will have to start over by carefully examining all steps of model preparation to calculate exactly at what stage the failure in model preparation occurred.\"}),/*#__PURE__*/e(\"h3\",{children:\"Quality assurance and model monitoring\"}),/*#__PURE__*/e(\"p\",{children:\"After achieving high-quality recognition of written text by the model, developers must not forget continuous quality control. This step is necessary to guarantee that the model's performance is always of excellent quality and that it delivers the best possible result over a long period of time.\"}),/*#__PURE__*/e(\"p\",{children:\"Furthermore, this stage of work can indicate to the development team that the model has to be further trained with additional datasets containing new images of various handwriting.\"}),/*#__PURE__*/e(\"h2\",{children:\"Applications of handwritten text detection\"}),/*#__PURE__*/e(\"p\",{children:\"OCR and HTR systems are employed in countless fields. 
Some of the tasks that text recognition systems solve are as follows:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Generation of digital versions of printed and handwritten documents.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Data reading on forms and questionnaires.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Automated vehicle license plate recognition.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Technology to assist blind and visually impaired individuals.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"ID documents data recognition.\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Extraction of information from business cards into contact lists.\"})})]}),/*#__PURE__*/e(\"p\",{children:\"Handwritten text detection may also be used for quick editing of one's notes and memos. When you write notes in class, you may take a picture of them and generate text on your computer that can be edited and modified. Handwriting recognition simplifies and speeds up the paperwork in hospitals, and government institutions that provide services for citizens. For writers who handwrite their books on paper and then retype the finished text, this automated process can make the job a lot easier.\"}),/*#__PURE__*/e(\"p\",{children:\"Such technology can simplify the job of historians who decipher historical documents that are written by hand. Some projects involve deciphering old books and ancient manuscripts. 
People decode photos or scans of such books by hand, which is often a complicated process. Very few people know how to do this. If a computer could do it, it could speed up the process dramatically. Machine learning could make this job a lot easier.\"}),/*#__PURE__*/e(\"p\",{children:\"These are certainly not all areas of application for handwritten text detection. So, the development of handwriting recognition technologies that can greatly simplify the process of data entry is an essential task for many users.\"}),/*#__PURE__*/e(\"h2\",{children:\"Summing up\"}),/*#__PURE__*/e(\"p\",{children:\"Currently, quite a few types of OCR systems exist. However, only some of them can recognize handwriting. Recognition systems with high speed and accuracy are typically very expensive to create, which makes them hardly available for mass implementation of online OCR other than the existing ones from major players.\"}),/*#__PURE__*/e(\"p\",{children:\"Text recognition can significantly improve and simplify the work of many people in various fields and institutions. Some development teams have already achieved an impressive level of handwriting recognition. However, many software solutions that are commonly available on the market do not fully solve the given task. Therefore, the challenge of developing a system that is accessible to a wide range of people and that enables the recognition of handwritten characters remains urgent. Through the advancements in the field of machine learning and neural networks, this task does not seem to be unfeasible.\"}),/*#__PURE__*/e(\"h2\",{children:\"About Toloka\"}),/*#__PURE__*/e(\"p\",{children:/*#__PURE__*/e(\"em\",{children:\"Toloka is a European company based in Amsterdam, the Netherlands that provides data for Generative AI development. Toloka empowers businesses to build high quality, safe, and responsible AI. We are the trusted data partner for all stages of AI development from training to evaluation. 
Toloka has over a decade of experience supporting clients with its unique methodology and optimal combination of machine learning technology and human expertise, offering the highest quality and scalability in the market.\"})})]});export const richText5=/*#__PURE__*/t(n.Fragment,{children:[/*#__PURE__*/e(\"p\",{children:\"Machine learning (ML) is a part of the artificial intelligence field. It is a powerful tool for data analysis, but its work and output is only as good as the initial dataset that drives it. Data-driven culture these days is one of the core parts of machine learning projects, and having enough data for specific ML purposes is arguably the most crucial aspect in this field.\"}),/*#__PURE__*/e(\"p\",{children:\"In this article, we will provide a brief overview on how to create a dataset for ML purposes and make it useful for particular ML tasks. By the end, you will have a high-level understanding of what goes into generating the right data that drives every ML algorithm there is. Let's get into it.\"}),/*#__PURE__*/e(\"h2\",{children:\"What are machine learning datasets?\"}),/*#__PURE__*/e(\"p\",{children:\"A machine learning dataset is a collection of data that trains and evaluates an ML model. Creating a good dataset for machine learning is a critical step in the process of training and evaluating ML models. In order to create an effective dataset, it is important to understand how to generate data for machine learning and what data is needed.\"}),/*#__PURE__*/e(\"p\",{children:\"The quality and size of the dataset play a crucial role in determining the accuracy and performance of the model. In general, the more data the model has access to, the better it will perform. 
However, it is important to strike a balance between the amount of data stored for processing and the computational resources required to process it.\"}),/*#__PURE__*/e(\"p\",{children:\"ML is a branch of artificial intelligence that lets computer systems learn and make predictions based on data without being explicitly programed to do so. This methodology can easily solve a wide range of complex problems, such as image and speech recognition, natural language processing, predictive maintenance, fraud detection, and powering recommendation systems. ML algorithms analyze vast amounts of data to identify patterns and make predictions with a high degree of accuracy.\"}),/*#__PURE__*/e(\"p\",{children:\"For all those operational advantages, machine learning still requires good data up front in order to work as effectively as possible. Furthermore, that data then needs to be organized in a way that an ML algorithm can understand in order to complete its tasks.\"}),/*#__PURE__*/e(\"h2\",{children:\"What are the steps?\"}),/*#__PURE__*/e(\"h3\",{children:\"Training data collection\"}),/*#__PURE__*/e(\"p\",{children:\"Creating a dataset for machine learning (ML) is an important step in the ML development process because it drives everything the algorithm outputs. Data sets load up an algorithm with the critical mass of clean and processed information it needs in order to work.\"}),/*#__PURE__*/e(\"p\",{children:\"The larger and cleaner a dataset is, the more effective the ML algorithm can become. Thus, gathering as much data as possible while keeping it relevant and balancing it all with your hardware capabilities is an ongoing thing in the machine learning process.\"}),/*#__PURE__*/e(\"p\",{children:\"So step one is to gather data properly from the beginning. If you already have data in paper ledgers or in .xlsx or .csv files, you may face challenges with digital data preparation. 
On the other hand, if you have a small dataset that is already friendly to ML, you're in a better position.\"}),/*#__PURE__*/e(\"h3\",{children:\"Data storage\"}),/*#__PURE__*/e(\"p\",{children:'Data can be stored in several different ways, from physical hard drives to the cloud. There\\'s even a popular big data storage solution known as the \"data lake,\" a repository of vast amounts of unstructured data. Data lakes can be built on top of commercial versions of Apache Hadoop, third-party cloud solutions, or ready-made products purchased from specialized vendors.'}),/*#__PURE__*/e(\"h3\",{children:\"Existing dataset, synthetic data, or data collection?\"}),/*#__PURE__*/e(\"p\",{children:\"If you're just starting out and don't have data, there are large open-source datasets available to you that make a good starting place. Public datasets are a valuable resource for anyone interested in machine learning and data analysis. They come from businesses and organizations that share their own open data with the public. These datasets can contain a wide range of information from various aspects of life, from healthcare records to weather patterns and transportation metrics to hardware utilization data.\"}),/*#__PURE__*/e(\"p\",{children:\"Sometimes companies employ synthetic data generation, which means using artificially generated datasets rather than real-life ones produced by real events. However, at the step of initial model training, synthetic data is not the best option.\"}),/*#__PURE__*/e(\"p\",{children:\"To properly train a model, the data should represent the real world, and a synthetic dataset can be prone to distortion. Synthetic datasets should rather be used to validate machine learning models' results later.\"}),/*#__PURE__*/e(\"p\",{children:\"Whatever your specific niche is, there is probably a useful public dataset out there for you. 
While these data sets may not provide loads of information about your specific business or its operations, they can still offer valuable insights into your industry and its niche, as well as your customer segments.\"}),/*#__PURE__*/e(\"p\",{children:\"However, the real value in machine learning comes from collecting your own robust datasets that are specific to your business needs and activities, and then using that to drive your algorithm. There's nothing quite as good as a purpose-built solution to a problem, so the data set built in-house for the purposes of your machine learning project will mostly always be better than a public data set available to anyone.\"}),/*#__PURE__*/e(\"p\",{children:\"When deciding between using a ready-made dataset or collecting your own data, it's important to consider your goals and the quality of available datasets. If your goals require unique, specific data that's not already out there, creating datasets is likely the best way. But using a ready-made dataset can also save time and resources.\"}),/*#__PURE__*/e(\"p\",{children:\"You should also consider the quality of the data, as collecting on your own might require additional effort to clean and process the data before it gets used.\"}),/*#__PURE__*/e(\"p\",{children:\"Next, depending on the ML approach you're using, you need to decide whether to label your data, and how exactly you want to do it. See our blog post to learn more about data labeling.\"}),/*#__PURE__*/e(\"h3\",{children:\"Data preparation\"}),/*#__PURE__*/e(\"p\",{children:\"Machine learning helps organizations make data-driven decisions and automate tasks that would otherwise require lots of manual effort. 
Its power lies in its ability to continuously learn and improve its performance over time, making it a highly valuable tool for solving complex problems.\"}),/*#__PURE__*/e(\"p\",{children:\"Unfortunately, datasets are often flawed in various ways that can impact the accuracy and performance of machine learning models.\"}),/*#__PURE__*/e(\"p\",{children:\"Some common flaws include:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Imbalanced classes (one class of data points significantly outnumbers another class).\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Missing data values (which cause problems with the model's accuracy and generalization capabilities).\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Noisy data (irrelevant or incorrect information that negatively impacts a machine learning model's performance).\"})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/e(\"p\",{children:\"Outliers (extremely high or low values that skew results).\"})})]}),/*#__PURE__*/e(\"p\",{children:\"To overcome these issues and more, data scientists need to clean and prepare the input that drives an ML algorithm's output. This ensures that the data model is reliable and will perform well.\"}),/*#__PURE__*/e(\"h3\",{children:\"Quality control\"}),/*#__PURE__*/e(\"p\",{children:\"Evaluating the quality of your data is crucial in creating a dataset for machine learning that will yield accurate and meaningful results. 
Here are some good questions to ask for determining the viability of your dataset:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Is your data appropriate for your task?\"}),\" For example, if you've been selling home appliances in the US, can you use the same data to predict stock and demand in Europe?\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Is your data balanced?\"}),\" If you have a large number of labeled data points for one class and only a few for another, your machine learning model may struggle to learn about the underrepresented class.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Is your data trustworthy?\"}),\" Mistakes in data collection or labeling can impact the accuracy of your dataset, so quality control mechanisms need to be added to your collection and labeling pipelines. 
Multiple datasets contradicting each other might decrease the quality of model training.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Have there been any technical issues when transferring data?\"}),\" For example, parts of the data might get duplicated or go missing due to things like server errors or a cyberattack.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"How many missing values does your data have?\"}),\" Such values can make it harder to use your dataset for machine learning.\"]})})]}),/*#__PURE__*/e(\"p\",{children:\"The success of a machine learning algorithm depends heavily on the quality of the data that drives it. Make sure the data is appropriate, balanced, trustworthy, and free of errors or technical issues. By addressing these problems early on, your machine learning models will yield meaningful and accurate results.\"}),/*#__PURE__*/e(\"h3\",{children:\"Formatting, cleaning, and reducing data\"}),/*#__PURE__*/e(\"p\",{children:\"There are three main steps that go into creating a quality dataset: formatting data, cleaning data, and reducing data.\"}),/*#__PURE__*/e(\"p\",{children:\"Formatting data is about making sure that the data within a given attribute is expressed consistently. Are all the dates and addresses written in the same file format? Does every dollar amount come with a dollar sign ($) or not? Input formats must be the same across the entire dataset.\"}),/*#__PURE__*/e(\"p\",{children:\"Data cleaning calls for removing any missing, erroneous, or less-representative values in the dataset to improve an ML algorithm's accuracy. There are several methods for cleaning training data, including substituting missing values with dummy values, using mean figures, or using the most frequent items. 
Some ML-as-a-service platforms can help automate this data cleaning process.\"}),/*#__PURE__*/e(\"p\",{children:\"Reducing data is about shrinking the overall size of a dataset by removing any irrelevant or unnecessary information.\"}),/*#__PURE__*/e(\"p\",{children:'\"Big data\" has been a popular business term for several years now and is often seen as the goal for ML, but having petabytes of data on hand doesn\\'t automatically lead to insights. In fact, a dataset that is large but not \"clean\" will often be more difficult for deriving valuable insights.'}),/*#__PURE__*/e(\"p\",{children:\"If you don't already have a data scientist on your team, this is probably the time to engage one. Domain expertise is important for determining which values should be included and which can be skipped. Appropriately reducing the size of the dataset improves the speed of computing time without sacrificing prediction accuracy.\"}),/*#__PURE__*/e(\"p\",{children:\"By ensuring consistent data formatting, removing any missing or erroneous values, and shrinking the dataset size by only keeping relevant information, the end result is a dataset that's more useful in machine learning algorithms.\"}),/*#__PURE__*/e(\"h3\",{children:\"Define new connections between various types of data\"}),/*#__PURE__*/e(\"p\",{children:'It\\'s important to capture specific relationships in your machine learning dataset. One way to do this is by \"decomposing\" complex values into multiple parts. This process is a bit like the opposite of reducing data \u2014 it involves adding new attributes based on the existing attributes.'}),/*#__PURE__*/e(\"p\",{children:\"For example, if your sales performance varies based on the day of the week, separating the day into a separate categorical value from the date can provide the algorithm with more relevant information.\"}),/*#__PURE__*/e(\"p\",{children:\"You also might have different types of data gathered from different data sources. 
Joining transactional data with attribute data also enhances the predictive power of your ML analysis.\"}),/*#__PURE__*/e(\"p\",{children:\"\\\"Transactional data\\\" refers to info about a specific moment, such as the price of a product at the time a user clicks the buy button. Attribute data is more static, however, and doesn't directly relate to specific events, such as a user's age or demographics. Both can be used as training data, depending on your goals.\"}),/*#__PURE__*/e(\"p\",{children:\"Suppose you're tracking sensor readings to predict maintenance needs for industrial machinery. Transactional data, like log files, can be combined with attribute data, like the equipment model, batch, and location, in order to find dependencies between equipment behavior and attributes.\"}),/*#__PURE__*/e(\"p\",{children:'Interpreting transactional data to define attributes can also be useful. If you manually analyzed website session logs of individual visitors, you might assign attributes to them like \"window shopper\" or \"instant buyer.\" That new attribute data can help optimize retargeting campaigns or predict a customer\\'s lifetime value.'}),/*#__PURE__*/e(\"h3\",{children:\"Rescaling\"}),/*#__PURE__*/e(\"p\",{children:\"Data rescaling is the process of improving a dataset by bringing numerical attributes onto a comparable range, avoiding situations where some values outweigh others. It helps make ML-driven predictions more accurate.\"}),/*#__PURE__*/e(\"p\",{children:'Suppose you have a dataset with attributes such as car model, body style, years of use, and price. The price attribute will have larger numbers associated with it, and will \"weigh\" more than the other attributes.'}),/*#__PURE__*/e(\"p\",{children:'Rescaling this dataset would call for evening out the weight of the price attribute. 
You can use a technique called \"min-max normalization\" to transform numerical values into a range from 0.0 to 1.0, where 0.0 represents the minimum and 1.0 the maximum value.'}),/*#__PURE__*/e(\"p\",{children:'A simpler rescaling approach is called \"decimal scaling,\" which involves changing data size by moving a decimal point in either direction.'}),/*#__PURE__*/e(\"h3\",{children:\"Discretizing\"}),/*#__PURE__*/e(\"p\",{children:\"Discretizing data involves converting numerical values into categorical values, which can simplify the work for an algorithm and make predictions more relevant.\"}),/*#__PURE__*/e(\"p\",{children:\"For example, if you're tracking customer ages, you won't be particularly concerned with the difference between a 14-year-old's purchases and a 15-year-old's purchases \u2014 they can be safely lumped together in a category that includes all teenagers. Discretizing is about turning numbers into qualitative categories.\"}),/*#__PURE__*/e(\"p\",{children:\"Rescaling and discretizing your data helps improve a dataset so that an ML algorithm can make more accurate predictions.\"}),/*#__PURE__*/e(\"h2\",{children:\"What a strong ML team looks like\"}),/*#__PURE__*/e(\"p\",{children:\"For being such a computer-based pursuit, machine learning actually calls for quite a bit of human involvement up front. 
We've already mentioned the importance of having a good data scientist on board for your machine learning purposes, but that shouldn't be the only professional involved in creating your dataset.\"}),/*#__PURE__*/e(\"p\",{children:\"Let's run through some of the important human roles that go into finalizing a dataset:\"}),/*#__PURE__*/t(\"ul\",{style:{\"--framer-font-size\":\"18px\",\"--framer-text-alignment\":\"start\",\"--framer-text-color\":\"rgb(30, 33, 38)\",\"--framer-text-stroke-width\":\"0px\",\"--framer-text-transform\":\"none\"},children:[/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Data engineer\"}),\": designs and maintains the dataset's architecture, ensures data is stored securely and efficiently.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Data collection/entry operator\"}),\": collects and enters data into databases, ensures data is entered accurately following established procedures and standards.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Quality assurance/control specialist\"}),\": ensures accuracy, completeness, and consistency of data, develops and implements data quality checks, regularly audits data integrity.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Data analyst\"}),\": prepares, cleans, and organizes data for analysis, performs exploratory analysis looking for patterns and relationships in the data, communicates those findings.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Data scientist\"}),\": analyzes, processes, and models data for the purpose of gaining insight and making predictions, develops and implements machine 
learning algorithms and statistical models that solve complex business problems.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Machine learning engineer\"}),\": develops and deploys machine learning models into production environments, works closely with data scientists to understand their needs.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Subject matter expert\"}),\": provides domain knowledge and understanding to the data science team, frames the problem, identifies important variables, and validates data analysis results.\"]})}),/*#__PURE__*/e(\"li\",{\"data-preset-tag\":\"p\",children:/*#__PURE__*/t(\"p\",{children:[/*#__PURE__*/e(\"em\",{children:\"Data annotator\"}),\": more often than not, it's not enough just to collect data. Training an ML model requires a labeled dataset. A data annotator is a person who manually adds labels to data, to help train the model and monitor the quality of its output.\"]})})]}),/*#__PURE__*/e(\"p\",{children:\"This list may run quite a bit longer, depending on the team. Other job titles worth mentioning here include statisticians, data visualization specialists, project managers, technical writers, and even an ethics review board.\"}),/*#__PURE__*/e(\"h2\",{children:\"Does it always have to be a big team?\"}),/*#__PURE__*/e(\"p\",{children:\"Some of these roles can be carried out by the same person. Same as in many other fields, in data science the differences between roles can be blurry.\"}),/*#__PURE__*/e(\"p\",{children:\"Sometimes an ML engineer who makes algorithm training possible, a software engineer who deploys the model, and even a data annotator who manually labels data are the same person, especially in smaller companies and startups. 
This situation in particular provides specialists with great experience in various fields, and gives them the opportunity to use their expertise from different fields to achieve the best results.\"}),/*#__PURE__*/e(\"p\",{children:\"From creating stable data streams, to data preprocessing, data augmentation, validation, labeling, and quality assurance, all these roles are important. Working together, these are the humans that build intelligent software that can learn and improve itself over time. Be it a large team, or only a few people, they can come up with the most effective ways on how to create a dataset for machine learning.\"}),/*#__PURE__*/e(\"h2\",{children:\"TL;DR\"}),/*#__PURE__*/e(\"p\",{children:\"Creating a machine learning dataset is a vital step in the larger ML process. That data directly impacts the accuracy and performance of the model, so it's important to collect raw data properly and store it suitably.\"}),/*#__PURE__*/e(\"p\",{children:\"The decision to use an existing dataset or to dive into dataset creation on your own depends on your specific business goals and the quality of existing datasets already out there.\"}),/*#__PURE__*/e(\"p\",{children:\"Data preparation and quality control are also important here. 
These practices ensure that only clean and accurate data goes into the model, so that it's trained in the best conditions possible to provide relevant results.\"}),/*#__PURE__*/e(\"p\",{children:\"With a well-prepared dataset, machine learning algorithms can analyze vast amounts of data to identify patterns and make accurate predictions.\"})]});\nexport const __FramerMetadata__ = {\"exports\":{\"richText1\":{\"type\":\"variable\",\"annotations\":{\"framerContractVersion\":\"1\"}},\"richText4\":{\"type\":\"variable\",\"annotations\":{\"framerContractVersion\":\"1\"}},\"richText2\":{\"type\":\"variable\",\"annotations\":{\"framerContractVersion\":\"1\"}},\"richText\":{\"type\":\"variable\",\"annotations\":{\"framerContractVersion\":\"1\"}},\"richText5\":{\"type\":\"variable\",\"annotations\":{\"framerContractVersion\":\"1\"}},\"richText3\":{\"type\":\"variable\",\"annotations\":{\"framerContractVersion\":\"1\"}},\"__FramerMetadata__\":{\"type\":\"variable\"}}}"],
  "mappings": "0LAAsJ,IAAMA,EAAsBC,EAAIC,EAAS,CAAC,SAAS,CAAcC,EAAE,IAAI,CAAC,SAAS,iUAA4T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qMAA2L,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0eAA0e,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,6CAA6C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gJAAgJ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2dAA2d,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kNAAkN,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2WAAsW,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,gEAA2D,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8GAA8G,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6DAAwD,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kLAA6K,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,yHAAyH,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kJAAkJ,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mCAAmC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+UAA+U,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kUAA6T,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,0EAAqE,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,sEAAsE,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kEAAkE,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,gBAAgB,CAAC,EAAE,wHAAwH,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,YAAY,CAAC,EAAE,0LAA0L,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,gBAAgB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+OAA+O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uGAAuG,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,oLAA+K,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,uCAAkC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,qIAAqI,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6DAA6D,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,YAAY,CAAC,EAAE,0HAA0H,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,aAAa,CAAC,EAAE,8IAA8I,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS
,CAAcE,EAAE,SAAS,CAAC,SAAS,0BAA0B,CAAC,EAAE,oKAAoK,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,6CAA6C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+NAA+N,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sUAAsU,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,sBAAmCE,EAAE,KAAK,CAAC,SAAS,0BAA0B,CAAC,EAAE,0OAA0O,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,+CAA4DE,EAAE,KAAK,CAAC,SAAS,wBAAwB,CAAC,EAAE,kHAAkH,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,uDAAoEE,EAAE,KAAK,CAAC,SAAS,yBAAyB,CAAC,EAAE,8OAA8O,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mFAAmF,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uFAAuF,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qVAAqV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iRAA4Q,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,6CAA6C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8RAA8R,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6RAA6R,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yJAAyJ,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,2BAA2B,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,iDAAiD,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,gCAAgC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,oEAAoE,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qKAAqK,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,wEAAwE,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,sFAAsF,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,8EAA8E,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,MAAM,CAAC,EAAE,iJAAiJ,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uOAAkO,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kGAAkG,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,uHAAuH,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uEAAuE,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,gCAAgC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,iCAAiC,CAAC,CAA
C,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mDAAmD,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kNAAkN,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,iBAAiB,CAAC,EAAE,wJAAwJ,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sQAA4P,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+SAA+S,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,6BAA6B,CAAC,EAAE,qIAAqI,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yPAAyP,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iOAAiO,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,kBAAkB,CAAC,EAAE,+EAA+E,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yeAAoe,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yBAAyB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uFAAuF,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,sQAAsQ,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,wNAAwN,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,2LAA2L,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kFAAkF,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,sCAAsC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kXAAkX,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wbAAwb,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wUAAwU,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,mYAA2YE,EAAEC,EAAE,CAAC,KAAK,4CAA4C,YAAY,GAAG,OAAO,YAAY,aAAa,GAAG,UAAU,CAAC,EAAE,QAAQ,oBAAoB,aAAa,GAAG,SAAsBD,EAAEE,EAAE,EAAE,CAAC,SAAS,wBAAwB,CAAC,CAAC,CAAC,EAAE,iBAAiB,CAAC,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,iBAAY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sNAAiN,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2IAA2I,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0WAAqW,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0RAAqR,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,cAA2BE,EAAEC,EAAE,CAAC,KAAK,4CAA4C,YAAY,GAAG,OAAO,YAAY,aAAa,GAAG,UAAU,CAAC,EAAE,QAAQ,oBAAoB,aAAa,GAAG,SAAsBD,EAAEE,EAAE,EAAE,CAAC,SAAS,wBAAwB,CAAC,CAAC,CAAC,EAAE,idAA4c,CAAC,CAAC,EAAeJ,EAAE,IAAI,CAAC,SAAS,CAAC,mHAAgIE,EAAEC,EAAE,CAAC,KAAK,0BAA0B,YAAY,GAAG,OAAO,YAAY,aAAa,GAAG,UAAU,CAAC,EAAE,QAAQ,oBAAoB,aAAa,GAAG,SAAsBD,EAAEE,EAAE,EAAE,CAAC,SAAS,MAAM,CAAC,CAAC,CAAC,EAAE,GAAG,CAAC,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAe
A,EAAE,IAAI,CAAC,SAAsBA,EAAE,KAAK,CAAC,SAAS,4fAA4f,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeG,EAAuBL,EAAIC,EAAS,CAAC,SAAS,CAAcC,EAAE,IAAI,CAAC,SAAS,qQAAqQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sWAAsW,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2OAA2O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6XAA6X,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6UAA6U,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mWAAmW,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0gBAA0gB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,meAAme,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kDAAkD,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oRAAoR,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kVAAkV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2GAA2G,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,SAAS,CAAC,SAAS,iBAAiB,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2TAA2T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,SAAS,CAAC,SAAS,+CAA+C,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iXAAiX,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oPAAoP,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6SAA6S,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,SAAS,CAAC,SAAS,gBAAgB,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+JAA+J,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,SAAS,CAAC,SAAS,SAAS,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+QAA+Q,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0NAA0N,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mZAAqZ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,SAAS,CAAC,SAAS,uBAAuB,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wQAAwQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,ogBAAqgB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mCAAmC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iKAAiK,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qVAAqV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yMAAyM,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,8BAA8B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sUAAsU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8OAA8O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gMAAgM,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oUAAoU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kgBAAkgB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6SAA6S,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qBAAqB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4TAA4T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8UAA8U,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2hBAA2hB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kSAAkS,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,8DAA8D,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wdAAwd,CAAC,EAAeA,EAA
E,IAAI,CAAC,SAAS,mnBAAmnB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,urBAAurB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,0CAA0C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6FAA6F,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kBAAkB,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,mBAAmB,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,wBAAwB,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,oCAAoC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kBAAkB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yGAAyG,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,uBAAuB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qgBAAqgB,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,YAAY,CAAC,EAAE,oRAAoR,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,aAAa,CAAC,EAAE,yRAAyR,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wPAAwP,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,0BAA0B,CAAC,EAAE,+PAA+P,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qBAAqB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6rBAAmrB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4VAA4V,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,gBAAgB,CAAC,EAAE,glBAAglB,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,YAAY,CAAC,EAAE,mUAAmU,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mBAAmB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0bAA0b,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qVAAqV,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,UAAU,CAAC,EAAE,6VAA6V,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,SAAS,CAAC,EAAE,qRAAqR,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,SAAS,CAAC,SAAS,UAAU,CAAC,EAAE,+bAA+b,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wBAAwB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mdAAmd,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0lBAA0lB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kRAAkR,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,wCAAwC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,mEAAmE,CAAC,CAAC,CAAC,EA
AeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,2DAA2D,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6VAA6V,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gcAAgc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mVAAmV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4aAA4a,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,eAAe,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0OAA0O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+TAA+T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qfAAqf,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,YAAY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8kBAA8kB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,KAAK,CAAC,SAAS,4fAA4f,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeI,EAAuBN,EAAIC,EAAS,CAAC,SAAS,CAAcC,EAAE,IAAI,CAAC,SAAS,yNAAyN,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yXAAyX,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wYAAmY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iTAAiT,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kVAA6U,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,+CAA+C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oLAAoL,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,mBAAmB,CAAC,EAAE,qWAAsW,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,yCAAsDE,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAE,4PAAyQA,EAAE,KAAK,CAAC,SAAS,kBAAkB,CAAC,EAAE,sFAAsF,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,ieAA4d,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,mBAAgCE,EAAE,KAAK,CAAC,SAAS,qBAAqB,CAAC,EAAE,2cAA2c,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6aAA6a,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,gDAA6DE,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAE,mcAA8b,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,oCAAoC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,wZAAqaE,EAAEC,EAAE,CAAC,KAAK,4CAA4C,YAAY,GAAG,OAAO,YAAY,aAAa,GAAG,UAAU,CAAC,EAAE,QAAQ,oBAAoB,aAAa,GAAG,SAAsBD,EAAEE,EAAE,EAAE,CAAC,SAAS,gBAAgB,CAAC,CAAC,CAAC,EAAE,yJAAoJ,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,4mBAAumB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6iBAA6iB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wRAAmR,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,uDAAuD,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gRAAgR,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gWAA2V,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uEAAuE,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kBAAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAA
S,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,4BAA4B,CAAC,EAAE,8UAAyU,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,2BAA2B,CAAC,EAAE,sNAAiN,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,sBAAsB,CAAC,EAAE,iLAA4K,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,6BAA6B,CAAC,EAAE,gLAA2K,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,0BAA0B,CAAC,EAAE,6NAAwN,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,8BAA8B,CAAC,EAAE,iVAA4U,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mIAAmI,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iCAAiC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+IAA+I,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kEAAkE,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iCAAiC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kWAAkW,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wBAAwB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4LAA4L,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mBAAmB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gOAAgO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,8BAA8B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kWAAkW,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kCAAkC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qOAAqO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kBAAkB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oMAAoM,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0OAA0O,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mCAAmC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6MAA6M,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,0BAA0B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+PAA+P,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,2BAA2B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mMAAmM,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wCAAwC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uNAAuN,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yCAAyC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wOAAwO,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gJAAgJ,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,0CAA0C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mfAAmf,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,w
EAAwE,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kBAAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,iCAAiC,CAAC,EAAE,2PAAsP,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,2BAA2B,CAAC,EAAE,gQAA2P,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,qCAAqC,CAAC,EAAE,mLAA8K,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,6NAA0OE,EAAEC,EAAE,CAAC,KAAK,4CAA4C,YAAY,GAAG,OAAO,YAAY,aAAa,GAAG,UAAU,CAAC,EAAE,QAAQ,oBAAoB,aAAa,GAAG,SAAsBD,EAAEE,EAAE,EAAE,CAAC,SAAS,6BAA6B,CAAC,CAAC,CAAC,EAAE,kBAAkB,CAAC,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,4CAA4C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2OAAsO,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8EAA8E,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,oBAAoB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yGAAyG,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yGAAyG,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,SAAS,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qHAAqH,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,UAAU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iGAAiG,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,sBAAsB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iGAAiG,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qBAAqB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gJAAgJ,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yBAAyB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2HAA2H,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,4BAA4B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wHAAwH,CAAC,EAAeF,EAAE,IAAI,CAAC,SAAS,CAAC,aAA0BE,EAAEC,EAAE,CAAC,KAAK,0BAA0B,YAAY,GAAG,OAAO,YAAY,aAAa,GAAG,UAAU,CAAC,EAAE,QAAQ,oBAAoB,aAAa,GAAG,SAAsBD,EAAEE,EAAE,EAAE,CAAC,SAAS,MAAM,CAAC,CAAC,CAAC,EAAE,2LAA2L,CAAC,CAAC,EAAeF,EAAE,KAAK,CAAC,SAAS,eAAe,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8ZAA8Z,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yVAAyV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sQAAiQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8SAA8S,CAAC,CAAC,CAAC,CAAC,EAAeK,EAAuBP,EAAIC,EAAS,CAAC,SAAS,CAAcC,EAAE,IAAI,CAAC,SAAS,sTAAsT,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,8BAA8B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,khBAA6g
B,CAAC,EAAeA,EAAE,MAAM,CAAC,IAAI,+BAA+B,UAAU,eAAe,OAAO,MAAM,IAAI,wEAAwE,OAAO,yWAAyW,MAAM,CAAC,YAAY,aAAa,EAAE,MAAM,MAAM,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,isBAAosB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kpBAAkpB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gaAAga,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,oCAAoC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2RAAsR,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0aAA0a,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,uBAAuB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mUAAmU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gaAAka,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,2BAA2B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,scAAic,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8gBAA8gB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yCAAyC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,keAAke,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oZAAoZ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8TAA8T,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,eAAe,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6eAA6e,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0lBAA2lB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qoBAAsoB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qpBAAupB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4cAA8c,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,uBAAuB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2lBAA2lB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mVAAmV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kEAAkE,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kBAAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,oFAAoF,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kEAAkE,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,gFAAgF,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gYAAiY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wPAAwP,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,2EAA2E,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+TAA0T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qdAAqd,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,idAAmd,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,4BAA4B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kVAAkV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mUAAmU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,ynBAAynB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0RAA0R,CAAC,EAAeA,EAAE,KAAK,
CAAC,SAAS,oBAAoB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8GAA8G,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,oBAAoB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uOAAuO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qBAAqB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4VAA4V,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,maAAma,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kBAAkB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+NAA+N,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yBAAyB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kJAAkJ,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,8CAA8C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8fAA+f,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iRAAiR,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wBAAwB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8jBAA8jB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,6BAA6B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,ycAAoc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sYAAkY,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,aAAa,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qaAAqa,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,obAA+a,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0NAA0N,CAAC,CAAC,CAAC,CAAC,EAAeM,EAAuBR,EAAIC,EAAS,CAAC,SAAS,CAAcC,EAAE,IAAI,CAAC,SAAS,yfAAyf,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0OAA0O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2ZAA2Z,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iZAAiZ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,seAAse,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8aAA8a,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,sCAAsC,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,+BAA+B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sOAAsO,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yUAAyU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iSAAiS,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mCAAmC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2UAA2U,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uYAAuY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2WAA2W,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,8BAA8B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4QAA4Q,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+SAA+S,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gVAAgV,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,6CAA6C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2WAA2W,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+PAA+P,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sRAAsR,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sZAAsZ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oaAAoa,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gVAAgV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+UAA+U,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yBA
AyB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+QAA+Q,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wcAAwc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qWAAqW,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6OAA6O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6bAA6b,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qCAAqC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2LAA2L,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kBAAkB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6UAA6U,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2UAA2U,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mCAAmC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2TAA2T,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wCAAwC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sMAAsM,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gJAAgJ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yOAAyO,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,8NAA8N,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,mBAAmB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4QAA4Q,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,oBAAoB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mUAAmU,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wCAAwC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oMAAoM,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,gBAAgB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oQAAoQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mVAAmV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+UAA+U,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,YAAY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+fAA+f,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kbAAkb,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wWAAwW,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,gBAAgB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mQAAmQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0WAA0W,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,qOAAqO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,wCAAwC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,ySAAyS,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sLAAsL,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,4CAA4C,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6HAA6H,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kBAAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,sEAAsE,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,2CAA2C,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,8CAA8C,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,+DAA+D,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EA
AE,IAAI,CAAC,SAAS,gCAAgC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,mEAAmE,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gfAAgf,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+aAA+a,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uOAAuO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,YAAY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4TAA4T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,imBAAimB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAsBA,EAAE,KAAK,CAAC,SAAS,4fAA4f,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeO,EAAuBT,EAAIC,EAAS,CAAC,SAAS,CAAcC,EAAE,IAAI,CAAC,SAAS,wXAAwX,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uSAAuS,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qCAAqC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0VAA0V,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wVAAwV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,seAAse,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sQAAsQ,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,qBAAqB,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,0BAA0B,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yQAAyQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mQAAmQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oSAAoS,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wXAAyX,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,uDAAuD,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0gBAAqgB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2OAA2O,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uNAAuN,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sTAAsT,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oaAAoa,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iVAAiV,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gKAAgK,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,yLAAyL,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kBAAkB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kSAAkS,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mIAAmI,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4BAA4B,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kBAAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,uFAAuF,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,uGAAuG,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,kHAAkH,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBA,EAAE,IAAI,CAAC,SAAS,4DAA4D,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EA
AE,IAAI,CAAC,SAAS,kMAAkM,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,iBAAiB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+NAA+N,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kBAAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,yCAAyC,CAAC,EAAE,kIAAkI,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,wBAAwB,CAAC,EAAE,kLAAkL,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,2BAA2B,CAAC,EAAE,sQAAsQ,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,8DAA8D,CAAC,EAAE,uHAAuH,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,8CAA8C,CAAC,EAAE,2EAA2E,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0TAA0T,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,yCAAyC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wHAAwH,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gSAAgS,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gYAAgY,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uHAAuH,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,oSAAqS,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wUAAwU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uOAAuO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,sDAAsD,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,mSAA+R,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0MAA0M,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0LAA0L,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iUAAmU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,iSAAiS,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sUAAuU,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,WAAW,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+LAA+L,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sNAAsN,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sQAAsQ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4IAA4I,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,gBAAgB,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kKAAkK,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,6UAAwU,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,0HAA0H,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,kCAAkC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,4TAA4T,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,wFAAwF,CAAC,EAAeF,EAAE,KAAK,CAAC,MAAM,CAAC,qBAAqB,OAAO,0BAA0B,QAAQ,sBAAsB,kB
AAkB,6BAA6B,MAAM,0BAA0B,MAAM,EAAE,SAAS,CAAcE,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,eAAe,CAAC,EAAE,sGAAsG,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,gCAAgC,CAAC,EAAE,+HAA+H,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,sCAAsC,CAAC,EAAE,0IAA0I,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,cAAc,CAAC,EAAE,qKAAqK,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,gBAAgB,CAAC,EAAE,oNAAoN,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,2BAA2B,CAAC,EAAE,4IAA4I,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,uBAAuB,CAAC,EAAE,kKAAkK,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,KAAK,CAAC,kBAAkB,IAAI,SAAsBF,EAAE,IAAI,CAAC,SAAS,CAAcE,EAAE,KAAK,CAAC,SAAS,gBAAgB,CAAC,EAAE,6OAA6O,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,kOAAkO,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,uCAAuC,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uJAAuJ,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,saAAsa,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,uZAAuZ,CAAC,EAAeA,EAAE,KAAK,CAAC,SAAS,OAAO,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,2NAA2N,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,sLAAsL,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,+NAA+N,CAAC,EAAeA,EAAE,IAAI,CAAC,SAAS,gJAAgJ,CAAC,CAAC,CAAC,CAAC,EACn4wIQ,EAAqB,CAAC,QAAU,CAAC,UAAY,CAAC,KAAO,WAAW,YAAc,CAAC,sBAAwB,GAAG,CAAC,EAAE,UAAY,CAAC,KAAO,WAAW,YAAc,CAAC,sBAAwB,GAAG,CAAC,EAAE,UAAY,CAAC,KAAO,WAAW,YAAc,CAAC,sBAAwB,GAAG,CAAC,EAAE,SAAW,CAAC,KAAO,WAAW,YAAc,CAAC,sBAAwB,GAAG,CAAC,EAAE,UAAY,CAAC,KAAO,WAAW,YAAc,CAAC,sBAAwB,GAAG,CAAC,EAAE,UAAY,CAAC,KAAO,WAAW,YAAc,CAAC,sBAAwB,GAAG,CAAC,EAAE,mBAAqB,CAAC,KAAO,UAAU,CAAC,CAAC",
  "names": ["richText", "u", "x", "p", "Link", "motion", "richText1", "richText2", "richText3", "richText4", "richText5", "__FramerMetadata__"]
}
