“Scale is growing. Grow with us.”
Founded: June 2016, San Francisco, US
Category: Artificial intelligence/Machine learning
Primary office: San Francisco, USA
Core technical team: San Francisco, USA
Employees: 97, plus 30,000 contractors across the globe who aid in the object-identification and labelling process.
Amount raised: USD $298.1 million (8 rounds, Sept 2020)
Data annotation products to support an increasing range of data inputs and annotation types for computer vision and natural language (NLP) applications.
- Scale 3D Sensor Fusion; The advanced annotation platform for 3D sensor, LiDAR and RADAR data.
- Scale Image; Comprehensive annotation for images
- Scale Text; Sophisticated annotation for text-based data
- Scale Document; Secure processing of documents
- Scale Video; Scalable annotation for video data
- Scale Audio
- Scale Nucleus; Nucleus is a new way, the right way, to develop ML models, helping to move away from the concept of one dataset and towards a paradigm of collections of scenarios
- Partnerships and individual investments including Dropbox founder Drew Houston, Twitch founder, OpenAI, Quora
- Community: Toyota, Voyage, Embark, Lyft, OpenAI, Skydio, Skip, Sea Machines, Standard Cognition, Pinterest, SAP, Samsung, Nuro, DoorDash, NVIDIA, Honda, Airbnb, Valeo, Aptiv.
- Big data, cloud, AI/ML
- Highlights machine learning’s intimate bond between human contractors and algorithms. The “human insight” can help minimize labeling bias and provide customers with data that is more precise and more accurate.
- Technology: Developer of a platform designed to accelerate the development of AI applications. The company’s software is more advanced than current alternatives and is able to label data faster and more cheaply.
- Offering solutions to various industries, such as self-driving cars, drones, robotics, AR/VR and retail
Distinct AI Features
Data labeling is not only practically important, it is also philosophically important to the field. Machine learning is a form of metaprogramming—the developer doesn’t directly write the program; the developer writes a program which itself writes the program. The developer provides a rough framework for what the program should look like (usually a neural network), and what its goal should be (usually a labeled dataset), and that spits out a program that is nonsensical to humans, but is better than any program a human could ever write.
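The metaprogramming idea can be made concrete with a toy sketch. The example below is purely illustrative and is not drawn from Scale's products: the function name `train_perceptron`, the learning rate, and the four labeled examples are all assumptions chosen for the demonstration. The developer supplies only a model shape (one linear unit) and a labeled dataset; the rule itself is never written by hand.

```python
# Toy illustration (hypothetical names and data): the developer never writes
# the AND rule. They provide a model shape and labeled examples, and the
# training loop produces the "program".

def train_perceptron(examples, epochs=20, lr=1.0):
    """Learn a single linear unit from (inputs, label) pairs."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            error = label - prediction  # zero when the prediction is right
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error

    def learned_program(inputs):
        # The generated "program": opaque numbers rather than readable rules.
        total = sum(w * x for w, x in zip(weights, inputs)) + bias
        return 1 if total > 0 else 0

    return learned_program


# The labeled dataset states the goal: behave like logical AND.
labeled_data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
and_gate = train_perceptron(labeled_data)
outputs = [and_gate(inputs) for inputs, _ in labeled_data]
```

If the labels in `labeled_data` were wrong, the learned program would be wrong in the same way, which is why label quality matters so much in this setting.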
- Scale accelerates the development of AI by democratizing access to intelligent data. Companies such as Alphabet, Voyage, nuTonomy, Embark, DriveAI and others use Scale's API, for autonomous vehicles and other use cases, to turn raw information into human-labeled training data that dependably powers their AI applications.
- Scale uses a combination of high-quality human task work, smart tools, statistical confidence checks and machine learning to consistently return scalable, precise data. Scale AI turns raw data into high-quality training data by combining machine learning powered pre-labeling and active tooling with varying levels and types of human review.
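One way to picture how machine pre-labeling, human review and confidence checks might fit together is the sketch below. It is not Scale's actual pipeline: the function `route_task`, the status strings, and both thresholds are hypothetical values picked for illustration only.

```python
from collections import Counter

# Hypothetical sketch: thresholds and status names are assumptions,
# not values published by Scale.
def route_task(pre_label, pre_label_confidence, human_labels,
               auto_accept_threshold=0.95, agreement_threshold=0.66):
    """Return (final_label, status) for one annotation task."""
    # A very confident model pre-label with no human votes yet is accepted.
    if pre_label_confidence >= auto_accept_threshold and not human_labels:
        return pre_label, "auto_accepted"
    # Otherwise the task needs at least one human annotation.
    if not human_labels:
        return None, "needs_human_review"
    # Confidence check: fraction of annotators agreeing on the top label.
    top_label, top_count = Counter(human_labels).most_common(1)[0]
    agreement = top_count / len(human_labels)
    if agreement >= agreement_threshold:
        return top_label, "accepted_by_consensus"
    # Annotators disagree too much, so send the task to a further review tier.
    return None, "escalated_to_expert"


confident = route_task("car", 0.99, [])
unreviewed = route_task("car", 0.60, [])
consensus = route_task("car", 0.60, ["car", "car", "truck"])
disputed = route_task("car", 0.60, ["car", "truck", "bus"])
```

Here `consensus` resolves to the label "car" because two of three annotators agree, while `disputed` is escalated because no label reaches the agreement threshold.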
AI use
Rate of return on customer’s investment to make AI work
- Addressed the need for large volumes of annotated data, with a commitment to safety being non-negotiable in the self-driving industry
- The ability to go to Scale for multi-modal annotation provides advantages in the automated driving space, including scenarios they don’t foresee today.
- Immediate: Scaling ground-truth data with Scale AI and classifying large volumes of images to develop an autonomous checkout system.
- Long term: Experience the future of retail by developing an autonomous checkout platform for brick-and-mortar retailers that can change how people shop in the future.
- An open-source data set called PandaSet that can be used for training machine learning models for autonomous driving, with 48,000 camera images, 16,000 LiDAR sweeps and more than 100 scenes of 8 seconds each.
- ImageNet, a repository of 14 million labeled images in more than 20,000 categories.
- A carefully trained machine learning algorithm can process very large data sets with enormous efficiency. One type of machine learning model is the convolutional neural network (CNN), an extremely powerful tool for image recognition and classification problems. A quantum computer could develop AI-based digital assistants with true contextual awareness and the ability to fully understand interactions with customers. That is because quantum computers have the potential to sort through a vast number of possibilities within a fraction of a second to come up with a probable solution.
- Data set: Labeled data is the key bottleneck to the growth of the machine learning industry. In fact, labeled data is even more essential than algorithms.
- Innovation and reputation
- Detailed labeling for companies’ old data via point cloud segmentation in the self-driving car industry: using 3D maps of the environment around a vehicle to encode what every point corresponds to (pedestrian, stop sign, window, shrub, or stroller).
- The team is also encoding the behavior of drivers, pedestrians, and cyclists with technology including “gaze detection,” which aims to indicate whether a driver might yield or a pedestrian plans to cross the street.
- Building the Future of Autonomous Vehicles
- Collecting and open sourcing labeled data
- Labeling data for companies, allowing them to identify blind spots and biases
- Roozbeh Abbasi
- Damilola Balogun