The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can achieve over 40K output tokens per second on NVIDIA Blackwell B200 GPUs, and are available to try as NVIDIA NIM microservices. The Llama 4 models are now natively multimodal and multilingual using a mixture-of-experts (MoE) architecture.
Large language models (LLMs) have permeated every industry and changed the potential of technology. However, their massive size makes them impractical given the resource constraints many companies face. Small language models (SLMs) bridge quality and cost by offering a much smaller resource footprint. SLMs are a subset of language models that tend to…
The new release introduces Python support in Service Maker to accelerate real-time multimedia and AI inference applications with a powerful GStreamer abstraction layer.
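Service Maker's abstraction layer sits on top of GStreamer, so as a rough illustration of the kind of raw pipeline code it is meant to simplify, here is a minimal GStreamer pipeline written with the standard Python GObject bindings. This is not the Service Maker API itself, and the pipeline string is only a placeholder.

```python
# Minimal GStreamer pipeline via the standard Python GObject bindings.
# This is NOT the Service Maker API, just the kind of lower-level code
# a higher-level abstraction layer would wrap for you.
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

# videotestsrc -> autovideosink: a stand-in for a real camera/inference pipeline
pipeline = Gst.parse_launch(
    "videotestsrc num-buffers=120 ! videoconvert ! autovideosink"
)
pipeline.set_state(Gst.State.PLAYING)

# Block until end-of-stream or an error is posted on the pipeline bus
bus = pipeline.get_bus()
bus.timed_pop_filtered(
    Gst.CLOCK_TIME_NONE, Gst.MessageType.EOS | Gst.MessageType.ERROR
)
pipeline.set_state(Gst.State.NULL)
```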
The exponential growth of visual data, ranging from images to PDFs to streaming videos, has made manual review and analysis virtually impossible. Organizations are struggling to transform this data into actionable insights at scale, leading to missed opportunities and increased risks. To solve this challenge, vision-language models (VLMs) are emerging as powerful tools…
The NVIDIA Maxine AI developer platform is a suite of NVIDIA NIM microservices, cloud-accelerated microservices, and SDKs that offer state-of-the-art features for enhancing real-time video and audio. NVIDIA partners use Maxine features to create better virtual interaction experiences and improve human connections with their applications. Making and maintaining eye contact are rare in virtual…
NVIDIA TAO is a framework designed to simplify and accelerate the development and deployment of AI models. It enables you to use pretrained models, fine-tune them with your own data, and optimize the models for specific use cases without needing deep AI expertise. TAO integrates seamlessly with the NVIDIA hardware and software ecosystem, providing tools for efficient AI model training…
This post is the third in a series on building multi-camera tracking vision AI applications. We introduced the overall end-to-end workflow and the fine-tuning process to enhance system accuracy in the first and second parts. NVIDIA Metropolis is an application framework and set of developer tools that leverages AI for visual data analysis across industries. Its multi-camera tracking reference…
Effective video communication is important for everyone who communicates online. For businesses, educators, and content creators, it is vital. NVIDIA Maxine is a suite of NVIDIA-accelerated SDKs and cloud-native, containerized NVIDIA NIM microservices for deploying AI features that enhance real-time audio and video for video conferencing, digital humans, virtual presence, and content creation.
Maritime startup Orca AI is pioneering safety at sea with its AI-powered navigation system, which provides real-time video processing to help crews make data-driven decisions in congested waters and low-visibility conditions. Every year, thousands of massive 100-million-pound vessels, ferrying $14T worth of goods, cross the world's oceans and waterways, fighting to keep to tight deadlines.
As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. Ever spotted someone in a photo wearing a cool shirt or some unique apparel and wondered where they got it? How much did it cost? Maybe you've even thought about buying one for yourself. This challenge inspired Snap's ML engineering team to introduce Screenshop, a service within Snapchat's app that uses AI to locate…
NVIDIA DeepStream is a powerful SDK that unlocks GPU-accelerated building blocks for end-to-end vision AI pipelines. With more than 40 plugins available off the shelf, you can deploy fully optimized pipelines with cutting-edge AI inference, object tracking, and seamless integration with popular IoT message brokers such as Redis, Kafka, and MQTT. DeepStream offers intuitive REST APIs to…
NVIDIA Metropolis Microservices for Jetson has been renamed Jetson Platform Services and is now part of NVIDIA JetPack SDK 6.0. NVIDIA Metropolis Microservices for Jetson provides a suite of easy-to-deploy services that enable you to quickly build production-quality vision AI applications using the latest AI approaches. This post explains how to develop and deploy generative AI…
The rapid growth in the size, complexity, and diversity of large language models (LLMs) continues to drive an insatiable need for AI training performance. Delivering top performance requires the ability to train models at the scale of an entire data center efficiently. This is achieved through exceptional craftsmanship at every layer of the technology stack, spanning chips, systems, and software.
NVIDIA and SparkFun invite developers to build innovative AI applications using the NVIDIA Jetson. Enter now.
NVIDIA TAO Toolkit provides a low-code AI framework that accelerates vision AI model development for all skill levels, from novices to expert data scientists. With the TAO Toolkit, developers can use the power and efficiency of transfer learning to achieve state-of-the-art accuracy and production-class throughput in record time with adaptation and optimization.
Learn how Vision Transformers are revolutionizing AI applications with image understanding and analysis.
Most drone inspections still require a human to manually inspect the video for defects. Computer vision can help automate and accelerate this inspection process. However, training a computer vision model to automate inspection is difficult without a large pool of labeled data for every possible defect. In a recent session at NVIDIA GTC, we shared how Exelon is using synthetic data generation…
According to the American Society for Quality (ASQ), defects cost manufacturers nearly 20% of overall sales revenue. The products we interact with on a daily basis, like phones, cars, televisions, and computers, must be manufactured with precision so that they can deliver value in varying conditions and scenarios. AI-based computer vision applications are helping to catch defects in the…
When you observe something over a period of time, you can find trends or patterns that enable predictions. With predictions, you can, for example, proactively alert yourself to take appropriate action. More specifically, when you observe moving objects, the trajectory is one of the most important ways to understand the target object's behavior, through which you can gain actionable insights…
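As a minimal sketch of the idea, assuming simple 2D tracks and a constant-velocity model (not the actual Metropolis or DeepStream trajectory analytics), the next position of a tracked object can be extrapolated from its last few observations:

```python
# Sketch: constant-velocity extrapolation of an object's next position
# from its last few observed (x, y) points. A simple stand-in for the
# trajectory analysis described above, not the actual product code.
import numpy as np

def predict_next(track: np.ndarray) -> np.ndarray:
    """track: array of shape (N, 2) holding observed (x, y) positions, N >= 2."""
    velocity = np.mean(np.diff(track, axis=0), axis=0)  # average per-step displacement
    return track[-1] + velocity                          # one step into the future

print(predict_next(np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]])))  # -> [3.  1.5]
```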
Robots are increasing in complexity, with a higher degree of autonomy, a greater number and diversity of sensors, and more sensor fusion-based algorithms. Hardware acceleration is essential to run these increasingly complex workloads, enabling robotics applications that can run larger workloads with more speed and power efficiency. The mission of NVIDIA Isaac ROS has always been to empower…
Over a billion cameras are deployed in the most important spaces worldwide, and these cameras are critical sources of video and data. It is becoming increasingly important to understand how to harness this data to make our spaces and processes more efficient and safer. Lumeo, an NVIDIA Metropolis partner, provides a "no-code" video analytics platform that enables developers and solution…
The Dataiku platform for everyday AI simplifies deep learning. Use cases are far-reaching, from image classification to object detection and natural language processing (NLP). Dataiku helps you with labeling, model training, explainability, model deployment, and centralized management of code and code environments. This post dives into high-level Dataiku and NVIDIA integrations for image…
Vision AI-powered applications are exploding in terms of value and adoption across industries. They're being developed both by sophisticated AI developers and by those totally new to AI. Both types of developers are being challenged with more complex solution requirements and faster time to market. Building these vision AI solutions requires a scalable, distributed architecture and tools that…
Detecting objects in high-resolution input is a well-known problem in computer vision. When only a certain area of the frame is of interest, inference over the complete frame is unnecessary. There are two ways to solve this issue: train a model that accepts the full high-resolution input, or run inference only on the region of interest. In many ways, the first approach is difficult: training a model with large input often requires larger backbones, making the overall model bulkier.
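A minimal sketch of the second approach follows: crop a region of interest before running inference, then map the detections back to full-frame coordinates. The detector callable and box format here are hypothetical stand-ins, not part of any specific SDK.

```python
# Sketch: run detection only on a region of interest (ROI) of a high-resolution
# frame, then shift the resulting boxes back to full-frame coordinates.
# `detector` is a hypothetical callable returning boxes as (x1, y1, x2, y2).
import numpy as np

def detect_in_roi(frame: np.ndarray, roi: tuple, detector) -> list:
    x, y, w, h = roi                   # ROI in full-frame pixel coordinates
    crop = frame[y:y + h, x:x + w]     # inference runs on the small crop only
    boxes = detector(crop)
    # Shift each box back into the coordinate system of the original frame
    return [(x1 + x, y1 + y, x2 + x, y2 + y) for (x1, y1, x2, y2) in boxes]
```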
MLPerf benchmarks are developed by a consortium of AI leaders across industry, academia, and research labs, with the aim of providing standardized, fair, and useful measures of deep learning performance. MLPerf Training focuses on measuring the time to train a range of commonly used neural networks across several tasks. Lower training times are important to speed time to deployment…
Instance segmentation is a core visual recognition problem for detecting and segmenting objects. In the past several years, this area has been one of the holy grails of the computer vision community, with applications ranging from autonomous vehicles (AVs) and robotics to video analysis, smart homes, digital humans, and healthcare. Annotation, the process of classifying every object in an image…
This post was written to enable the beginner developer community, especially those new to computer vision and computer science. NVIDIA recognizes that solving, and benefiting from, the world's visual computing challenges through computer vision and artificial intelligence requires all of us. NVIDIA is excited to partner with and dedicate this post to the Black Women in Artificial Intelligence.
AI applications are powered by machine learning models that are trained to predict outcomes accurately based on input data such as images, text, or audio. Training a machine learning model from scratch requires vast amounts of data and a considerable amount of human expertise, often making the process too expensive and time-consuming for most organizations. Transfer learning is the happy…
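As a hedged sketch of the idea, not the TAO workflow itself and assuming a recent torchvision release, transfer learning can be as simple as freezing a pretrained backbone and training only a new task-specific head:

```python
# Sketch: transfer learning with a pretrained backbone (recent torchvision assumed).
# Only the final classification layer is replaced and trained; the rest is frozen.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5                              # size of the new, smaller target task
model = models.resnet50(weights="IMAGENET1K_V2")

for param in model.parameters():             # freeze the pretrained feature extractor
    param.requires_grad = False

model.fc = nn.Linear(model.fc.in_features, num_classes)  # new task-specific head

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
# ...training loop over the (much smaller) labeled dataset goes here...
```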
This guest post was submitted by Drishtic AI lead developer Priti Gavali and technical architect Archana Borawake. The fashion industry is seeing many changes in terms of new technologies and evolving consumer trends. As one of the fastest-growing sectors in retail, the fashion industry is using data to better understand consumers' clothing tastes and preferences. Drishtic AI's solution…
See the latest AI vision advancements in developer tools, accelerated research, smart spaces, and deploying AI at the edge. With innovation happening across many industries, don't miss all the exciting use cases and discoveries that will be presented at this GTC. NVIDIA GTC runs from November 8-11, with a focus on all things computer vision, including Intelligent Video Analytics…
This post discusses tensor methods, how they are used at NVIDIA, and how they are central to the next generation of AI algorithms. Tensors, which generalize matrices to more than two dimensions, are everywhere in modern machine learning. From deep neural network features to videos or fMRI data, the structure in these higher-order tensors is often crucial.
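For a concrete taste of what a higher-order tensor and a tensor decomposition look like, here is a minimal sketch using TensorLy (assuming it is installed; the shape and ranks are arbitrary illustrative values):

```python
# Sketch: a 3rd-order tensor and its Tucker decomposition with TensorLy.
# The shape and ranks are arbitrary; think frames x height x width for video.
import numpy as np
import tensorly as tl
from tensorly.decomposition import tucker

X = tl.tensor(np.random.rand(10, 20, 30))

# Tucker factorizes X into a small core tensor plus one factor matrix per mode
core, factors = tucker(X, rank=[5, 5, 5])

# Reconstruct and check how well the compressed form approximates the original
X_hat = tl.tucker_to_tensor((core, factors))
print("relative error:", float(tl.norm(X - X_hat) / tl.norm(X)))
```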
Facebook AI researchers this week announced SEER, a self-supervised model that surpasses the best self-supervised systems and also outperforms supervised models on tasks including image classification, object detection, and segmentation. Combining RegNet architectures with the SwAV online clustering approach, SEER is a billion-parameter model pretrained on a billion random images.
"Meet the Researcher" is a monthly series in which we spotlight different researchers in academia who are using NVIDIA technologies to accelerate their work. This month, we spotlight Lorenzo Baraldi, Assistant Professor at the University of Modena and Reggio Emilia in Italy. Before working as a professor, Baraldi was a research intern at Facebook AI Research. He serves as an Associate Editor…
Dan Jia, Alexander Hermans, and Bastian Leibe of RWTH Aachen University were awarded the Jetson Project of the Month for the DR-SPAAM Detector. The Distance Robust Spatial Attention and Auto-regressive Model (DR-SPAAM), which runs on a Jetson AGX Xavier, is a deep learning model for detecting persons in 2D range sequences from a laser scanner. Sensors such as RGB(-D) cameras and lidars or a…
Lidars and cameras alone aren't enough to put self-driving into action. Sensor diversity is a cornerstone of autonomous driving. However, it only works if every sensor is in alignment. The NVIDIA DriveWorks SDK makes it possible to perform sensor calibration both offline, before the vehicle hits the road, and while the vehicle is driving, with self-calibration. Developers can learn how to use…
In episode two of the Grandmaster Series, learn how participating members of the Kaggle Grandmasters of NVIDIA (KGMON) built large-scale image classification models to win the Google Landmark Recognition 2020 Kaggle competition. In this landmark recognition challenge, the team had to build models that recognize the correct landmark (if any) in a dataset of complicated test images.
NASA scientists and collaborators have achieved a supercomputing and AI breakthrough: a deep learning model that has the potential, with some limitations, to map the location and size of every tree worldwide. The new model, described in a new Nature paper, lays the foundation for a more accurate global measure of carbon storage on land. In this particular study, the team counted over 1.8…
Advanced video analysis solutions are in great demand across multiple industries. Some popular use cases include understanding customer brand sentiment in retail aisles, occupancy analytics for crowd management in mass transit locations, optimizing vehicle traffic patterns in cities, monitoring social distancing protocols in hospitals and malls, and detecting defects in manufacturing facilities.
Agustinus (Gus) Nalwan was awarded the Jetson Project of the Month for his interactive AI bot, Qrio. This bot, running on the NVIDIA Jetson Nano, can ask for a toy, identify and state its name, and play videos related to it. Gus was inspired by the curiosity of his toddler and fondly named the bot Qrio (a clever combination of the words "question" and "curiosity"). The functionality of the…
CVPR is one of the main conferences that provide researchers and engineers with the opportunity to meet and discuss their amazing work. This year, with CVPR and other conferences going virtual, we take the opportunity to recognize our academic and Inception industry partners' work at CVPR 2020 through this post. Here's one paper from UC Berkeley researchers: While machine learning at the edge…
GPU parallel computing is delivering high performance to autonomous vehicle evaluation. In a research paper presented at the Computer Vision and Pattern Recognition Conference (CVPR) this week, NVIDIA GPUs were found to drastically reduce the time it takes to evaluate perception models using a new, sophisticated evaluation metric named Planning Kullback-Leibler Divergence (PKL).
Researchers, developers, and engineers from all over the world are gathering virtually this year for the 2020 Conference on Computer Vision and Pattern Recognition (CVPR). NVIDIA Research will present its research through oral presentations, posters, and interactive Q&As. NVIDIA's accepted papers at this year's online CVPR feature a range of groundbreaking research in the field of computer…
Forty years to the day since PAC-MAN first hit arcades in Japan and went on to munch a path to global stardom, the retro classic has been reborn, delivered courtesy of AI. Trained on 50,000 episodes of the game, a powerful new AI model created by NVIDIA Research, called NVIDIA GameGAN, can generate a fully functional version of PAC-MAN without an underlying game engine. That means that even…
Facebook this week announced a GPU-accelerated model designed for shopping. The model uses AI to automatically identify consumer goods from images to help make them shoppable. GrokNet, a universal computer vision system, can identify items in categories such as fashion, auto, and home decor. The model is in production today and is available for buyers and sellers in Facebook Marketplace.
NVIDIA today announced new AI models to help the medical community better track, test, and treat COVID-19. Available today, AI models developed jointly with the National Institutes of Health (NIH) can help researchers study the severity of COVID-19 from chest CT scans and develop new tools to better understand, measure, and detect infections. The models are immediately available in the…
Digitizing millions of historical documents and newspapers is a challenging task. To help speed up the process, the U.S. Library of Congress developed a GPU-accelerated deep learning model to automatically extract, categorize, and caption over 16 million pages of historic American newspapers published between 1789 and 1963. The work, which is being made publicly available for…
Social distancing is one of the most important defenses against the spread of COVID-19. The team at Galliot was awarded the Jetson Project of the Month for their "Smart Social Distancing with AI" application. This open-source application based on the Jetson Nano helps businesses monitor social distancing practices on their premises and take corrective action in real time. The application's main…
AI developers, data scientists, and companies building intelligent video analytics apps face significant challenges in creating and deploying highly accurate AI. Some of the key issues include dataset collection and labeling, achieving high accuracy with the available dataset, deploying on legacy infrastructure, and scalability of apps and services. With billions of cameras and sensors deployed…
Sweden-based Mapillary, a premier member of NVIDIA Inception, uses deep learning to automate mapping. Their platform provides street-level mapping by stitching together images sourced from its community of individual contributors, companies, and governments. NVIDIA Inception is a startup accelerator. Yesterday, the company announced the release of a new product, the Mapillary Street-Level…
At the epicenter of the coronavirus outbreak in Wuhan, China, a team of physicians is using GPU-accelerated AI software to detect visual signs of the coronavirus (COVID-19). Physicians there say the AI-based software, which relies on NVIDIA GPUs for both training and inference, has helped overworked staff screen patients and prioritize those likely to have the virus. The software…
To help autonomous machines better sense transparent objects, Google researchers, in collaboration with Columbia University and Synthesis AI, developed ClearGrasp, an algorithm that can accurately estimate the 3D data of clear objects, like a glass container or plastic utensil, from standard RGB images. "Enabling machines to better sense transparent surfaces would not only improve safety but…
To help you get up and running with deep learning and inference on NVIDIA's Jetson platform, today we are releasing a new video series named Hello AI World. In the first episode, Dustin Franklin, Developer Evangelist on the Jetson team at NVIDIA, shows you how to perform real-time object detection on the Jetson Nano. In this hands-on tutorial, you'll learn how…
To help autonomous vehicles and robots potentially spot objects that lie just outside a system's direct line of sight, researchers from Stanford, Princeton, Rice, and Southern Methodist universities developed a deep learning-based system that can detect objects, including words and symbols, around corners. "Compared to other approaches, our non-line-of-sight imaging system provides uniquely high…
With as many as 2 billion parking spaces in the United States, finding an open spot in a major city can be complicated. To help city planners and drivers more efficiently manage and find open spaces, MIT researchers developed a deep learning-based system that can automatically detect open spots from a video feed. "Parking spaces are costly to build, parking payments are difficult to enforce…
To help neurosurgeons diagnose brain tumors more efficiently, researchers from the University of Michigan developed a deep learning-based imaging technique that can reduce the tumor diagnosis process during surgery from 30-40 minutes to less than three minutes. First unveiled in 2017, the technique, called stimulated Raman histology (SRH), helps neurosurgeons more rapidly assess tumor tissue in…
By: Xiaolin Lin, Yilin Yang. Editor's note: This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on all of our automotive posts here. Lane and road edge detection is critical for self-driving car development: lane detection powers systems like lane…
To help accelerate microscopy, researchers from the Salk Institute, the University of Texas at Austin, fast.AI, and others developed a new AI-based microscopy approach that has the potential to make microscopic techniques used for brain imaging 16 times faster. "Point scanning imaging systems are perhaps the most widely used tools for high-resolution cellular and tissue imaging,"…
By JC Li. Editor's note: This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on all automotive posts. AI can now make it easier for cars to see in the dark, while ensuring other vehicles won't be blinded by the light. High beam lights can increase the…
There's a tremendous opportunity to bring efficiency to our cities, retail operations, manufacturing lines, and shipping and routing in warehouses. The groundwork has already been laid with billions of sensors and cameras installed worldwide that are rich sources of data. Yet the ability to extract insights from this information has been challenging, and today's solutions are siloed for…
By Jason Phang, PhD student at the NYU Center for Data Science. Breast cancer is the second leading cancer-related cause of death among women in the US. However, screening mammograms require radiologists to arduously pore over extremely high-resolution mammography images, looking for features suggestive of cancerous or suspicious lesions. This appears to be an ideal situation to apply deep…
For the first time using AI, researchers from Yamagata University in Japan, in collaboration with IBM, have discovered 142 new geoglyphs, which depict people, animals, and other beings, among the ancient motifs of the Nazca Pampa region of Peru. The new geoglyphs were identified using high-resolution 3D data taken from on-site surveys and aerial imagery collected since 2004.
Using GANs, this deep learning-based system can act as a personal fashion designer by recommending changes that can make a person's outfit more fashionable. Using large portions of NVIDIA's Pix2PixHD code, Facebook AI researchers, in collaboration with UT Austin, Cornell University, and Georgia Tech, developed Fashion++, a deep learning-based model that uses GANs to offer suggestions on what to…
Generating road layouts for different city styles is a time-consuming task. Artists need to manually create the road geometry and adjust parameters such as width and curvature by comparing it with real-world maps. But what if there was a better way to do this? At the International Conference on Computer Vision in Seoul, Korea, NVIDIA researchers, in collaboration with the University of Toronto…
To help advance medical research while preserving data privacy and improving patient outcomes for brain tumor identification, NVIDIA researchers, in collaboration with King's College London researchers, today announced the introduction of the first privacy-preserving federated learning system for medical image analysis. NVIDIA is working with King's College London and French startup Owkin to enable…
Every week we bring you the top NVIDIA updates and stories for developers. In this week's edition of our top 5 videos, we highlight a new GPU-accelerated supercomputer at MIT, a new Jetson-based drone, a GAN for fashion, and the brand new RTX Broadcast Engine SDK. Watch below: Redwood City, California-based Skydio, a member of NVIDIA's startup accelerator, Inception…
Redwood City, California-based Skydio, a member of NVIDIA's startup accelerator Inception, has just released the latest version of its AI-capable, GPU-accelerated drone, the Skydio 2. Equipped with six 4K cameras and an NVIDIA Jetson TX2 as the processor for the autonomous system, the Skydio 2 can fly for up to 23 minutes at a time and can be piloted by either an experienced pilot…
This summer, student interns at Booz Allen Hamilton bested the competition in edge computing with the help of the NVIDIA Jetson Nano. The Booz Allen Summer Games Challenge (SGC) calls on student interns across the U.S. to develop breakthrough solutions for its clients' most pressing problems. This summer, Project RAZOR placed in the top 10 among artificial intelligence and machine learning projects with…
How much dark matter is there in the universe? This AI model might have the answer. A team of physicists and computer scientists at ETH Zurich developed a deep learning-based model to estimate the amount of dark matter in the universe. This is the first time AI researchers have used this type of algorithm to analyze dark matter, the researchers said. As a first step, the team trained a…
As speech recognition applications become mainstream and get deployed through devices in the home, car, and office, research from academia and industry for this space has exploded. To present their latest work, global AI leaders and developers, including NVIDIA researchers, will come together in Austria next week to discuss the latest automatic speech recognition (ASR) breakthroughs…
To help monitor traffic conditions, locate missing vehicles, and possibly help find lost children, University of Toronto and Northeastern University researchers developed a deep learning-based vehicle image search engine that can be used to narrow down the location of a vehicle. "We developed the Vehicle Image Search Engine (VISE) to support the fast search of highly similar vehicles given an…
To help with animal conservation efforts, University of Oxford researchers developed a deep learning-based model that can identify individual chimpanzees with 93% accuracy and correctly classify their sex with 96% accuracy. "Automating the process of individual identification could represent a step change in our use of large image databases from the wild to open up vast amounts of data…
By: Berta Rodriguez Hervas. Editor's note: This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on all of our automotive posts. One of the first lessons in driving is simple: Green means go, red means stop. Self-driving cars must learn the same principles…
Looking to create a cartoon emoji version of yourself? This new AI app can help. A new deep learning-based messenger app can turn users into personalized cartoon characters. "Each avatar is unique and personalized to its user," the company explained. "In this app you can be anyone you want: a human digital copy of yourself, a unicorn, a rabbit or even a tiger!"
The Oregon State University College of Engineering has just purchased six NVIDIA DGX-2 systems to help accelerate their work in AI, robotics, driverless vehicles, and other research areas that require powerful GPU compute. "The computing power we now possess will accelerate our research in artificial intelligence and machine learning, while exposing our computer science students to the most…
Amber is a suite of biomolecular simulation programs that began in the late 1970s and is maintained by an active development community. It is a particle simulation code for looking at how molecules move under Newtonian approximations. The Amber software suite is divided into two parts: AmberTools18, the most recent version of a collection of freely available programs mostly under the GPL…
University of California, Irvine researchers recently developed a deep neural network that can solve a Rubik's Cube in a fraction of a second with 100 percent accuracy in all testing configurations. "Artificial intelligence can defeat the world's best human chess and Go players, but some of the more difficult puzzles, such as the Rubik's Cube, had not been solved by computers…
According to the American Cancer Society, prostate cancer is the second most common cancer in American men, averaging around 175,000 new cases every year. During the diagnosis process, more than one million men in the U.S. alone undergo a prostate biopsy, a procedure that results in 10-12 needle cores per patient and more than 10 million tissue samples that need to be examined by pathologists.
The Square Kilometre Array (SKA) project is an effort to build the world's largest radio telescope, with a collecting area of over one square kilometre. The design and development of the SKA is a truly global effort involving 100 organizations across 20 countries. The SKA is one of the largest scientific endeavors in history, and its scale represents a huge leap forward in both engineering and…
By: Abhishek Bajpayee. This is the latest post in our NVIDIA DRIVE Labs series, which takes an engineering-focused look at individual autonomous vehicle challenges and how NVIDIA DRIVE addresses them. Catch up on all automotive posts. Autonomous vehicles rely on cameras to see their surrounding environment. However, environmental factors such as rain, snow, or other blockages can affect…
Santa Monica, California-based startup Pearl, a new healthcare company focused on the dental industry, has just raised $11 million in series A funding to create a holistic oral health platform. "Pearl will have an immediate positive impact on the dental category," Ophir Tanz, the company's CEO, told VentureBeat. "It will streamline tedious, repetitive tasks…
To help potential homebuyers get a 360-degree tour of a home, Zillow, the online real estate database company, recently launched a new app and service across North America that relies on machine learning to generate 3D walkthroughs of a home. "Previously, 3D tours were only found on high-end or expensive homes, due to the high cost and time-intensive capture process," said Josh Weisberg…
By Sandra Skaff. This week top AI researchers are gathered in New Orleans, LA, to present their cutting-edge research. Our NVAIL partners are among the researchers presenting this work. We highlight here the work of three of these partners, which has been developed with robotics as a target application. Mila researchers are presenting BabyAI, which is a platform for learning to…
Every week we highlight NVIDIA's top 5 AI stories of the week. In this week's edition we cover a new deep learning-based algorithm from OpenAI that can automatically generate new music. Plus, an automatic speech recognition model that could improve Alexa's algorithm by 15%. Watch below: Planning a workout that is specific to a user's needs can be challenging.
According to the World Health Organization (WHO), there are an estimated 360 million people worldwide with disabling hearing loss. To help with sign language translation, researchers from Michigan State University developed a deep learning-based system that can automatically interpret individual signs of American Sign Language (ASL) as well as translate full ASL sentences without needing…
Every week we highlight NVIDIA's top 5 AI stories of the week. In this week's edition we highlight a new deep learning-based algorithm that is working towards replicating the energy of the sun on Earth. Plus, a robot that can efficiently interact with liquid and moldable items. Watch below: This week at TechCrunch's Robotics + AI event at UC Berkeley, NVIDIA's VP of…
Imagine a robot that can efficiently model clay, push ice cream onto a cone, or mold the rice for your sushi roll. MIT researchers developed a deep learning-based algorithm that improves a robot's ability to mold materials into shapes, as well as enabling it to interact with liquids and solid objects. The work draws on inspiration from how humans interact with different objects.
New York City-based startup TheTake, a member of the NVIDIA Inception program, recently unveiled a new deep learning-based algorithm that can automatically decode what a celebrity, athlete, or other public figure is wearing in a video in near real time. "TheTake's mission is simple: making media content shoppable," said Jared Browarnik, the company's Co-founder and Chief Technology Officer.
According to the American Cancer Society, more than 229,000 people will be diagnosed with lung cancer in the United States this year, with adenocarcinoma being the most common type. To help with diagnosis, researchers from Dartmouth's Norris Cotton Cancer Center and the Hassanpour Lab at Dartmouth University developed a deep learning-based system for automated classification of histologic subtypes…
Wondering how AI can inspire artists to create their best work? Renowned artist Chris Peters recently purchased a new NVIDIA TITAN RTX GPU with the intention of using it to create art. The results are stunning compositions generated by the AI, and actual oil paintings painted by Peters himself. "The AI Muse produces digital images, but a digital image is not a painting and a computer printout…
Editor's note: The story below is a guest post written by current and former postgraduate students at the University of Oxford, a member of the NVIDIA AI Labs (NVAIL) program. If you follow science news, you've probably heard about the latest machine-over-man triumph by DeepMind. This time, the new AlphaStar algorithm was able to defeat a professional in the popular competitive strategy game…
Researchers from Fujitsu just announced a new speed record for training ImageNet to 75% accuracy in 74.7 seconds. The new record beats the previous one, set by Sony in November of last year, by more than 47 seconds. The team achieved the record by using 2,048 NVIDIA Tesla V100 GPUs and the MXNet deep learning framework on the AI Bridging Cloud Infrastructure system at the…
At GTC Silicon Valley in San Jose, NVIDIA released the latest version of TensorRT, 5.1, which includes 20+ new operators and layers, integration with TensorFlow 2.0, ONNX Runtime, and TensorRT Inference Server 1.0. TensorRT 5.1 includes support for 20+ new TensorFlow and ONNX operations, the ability to update model weights in engines quickly, and a new padding mode to match native framework…
In this week's video, see a robot that can recognize objects from touch, an AI tool that can automatically detect open parking spots, a new technology to generate lifelike avatars, and much more. Plus, see how GIPHY used AI and NVIDIA GPUs to develop an open source celebrity detection model. Watch below: Drawing inspiration from how humans interact with objects through touch…
Trying to figure out who is in that celebrity GIF? This AI-based algorithm can help. Researchers from GIPHY, the online search database for GIFs, recently developed an open source deep learning model that can recognize over 2,300 celebrity faces with high accuracy. "We needed a tool that could find and annotate this content within our ever-growing library of GIFs, so that this content could…
Drawing inspiration from how humans interact with objects through touch, University of California, Berkeley researchers developed a deep learning-based perception framework that can recognize over 98 different objects from touch. According to the team, this is the first project that addresses this type of robot-object interaction using only touch at a large scale. "When we see a soft toy…
From an AI algorithm that can predict earthquakes to a system that can decode rodent chatter, here are the top 5 AI stories of the week. Most people can't detect an earthquake until the ground under their feet is already shaking or sliding, leaving little time to prepare or take shelter. Scientists are trying to short circuit that surprise using the critical time window during the…
University of Michigan researchers recently published a paper describing a new deep learning based-algorithm that can predict the future location of a pedestrian, along with their pose and gait. "The proposed network is able to predict poses and global locations for multiple pedestrians simultaneously for pedestrians up to 45 meters from the cameras," the researchers stated in their paper.
According to UNICEF, 1.2 million children are trafficked every year. To help identify and rescue victims of trafficking, researchers from George Washington University, Adobe, and Temple University released a new dataset called Hotels-50K and developed an AI-based algorithm that can be used to identify possible locations of where children are being held. "Recognizing a hotel from an image of a…
From 2016 to 2017, U.S. beekeepers lost 33 percent of their bees. Many factors contribute to the death of bees, but the number one culprit is the Varroa mite. Manually detecting the mites is tedious and prone to human error. Researchers from EPFL, a research institution and university in Lausanne, Switzerland, developed a deep learning-based app that can automatically count the number of Varroa mites…