Deep Learning (public)
 
Most learning is superficial and fades quickly. This podcast will equip you to move to learning that is durable because it is deep. Deep learning lasts because it respects the way the brain works. Inquiring minds want to know "how" and "why"—not just what!
 
The podcast's guests are specialists in various areas of AI. With them we discuss the AI Researcher profession, career growth, and interviews, as well as research in different fields, from fundamental AI to medicine and quantum computers.
 
Welcome to The Deep Learning Crowd Podcast. We talk about individual journeys in the world of AI and discuss topics around Deep Learning. We hear firsthand from our guests about some of the most interesting applications their companies are using right now.
 
Find me on Github/Twitter/Kaggle @SamDeepLearning. Find me on LinkedIn @SamPutnam. This Podcast is supported by Enterprise Deep Learning | Cambridge/Boston | New York City | Hanover, NH | http://www.EnterpriseDeepLearning.com. Contact: Sam@EDeepLearning.com, 802-299-1240, P.O. Box 863, Hanover, NH, USA, 03755. We move deep learning to production. I teach the worldwide Deploying Deep Learning Masterclass at http://www.DeepLearningConf.com in NYC regularly and am a Deep Learning Consultant ser ...
 
Most AI research today is done by elite universities and corporate labs. The pursuit of science and creativity should be a viable project for any person, from any walk of life, who is excited by that feeling of mystery and building something that grows. chloe is an end-to-end neural network chatbot written in PyTorch and based on the transformer. Accomplishing goals through conversation is a task we can relate to, so chatbots are an ideal agent through which to connect new research to our current u ...
 
Introducing a novel transformer architecture, Differential Transformer, designed to improve the performance of large language models. The key innovation lies in its differential attention mechanism, which calculates attention scores as the difference between two separate softmax attention maps. This subtraction effectively cancels out irrelevant co…
 
Introducing ScienceAgentBench, a new benchmark for evaluating language agents designed to automate scientific discovery. The benchmark comprises 102 tasks extracted from 44 peer-reviewed publications across four disciplines, encompassing essential tasks in a data-driven scientific workflow such as model development, data analysis, and visualizatio…
 
Both sources explain neural network pruning techniques in PyTorch. The first source, "How to Prune Neural Networks with PyTorch," provides a general overview of the pruning concept and its various methods, along with practical examples of how to implement different pruning techniques using PyTorch's built-in functions. The second source, "Pruning T…
 
The source is a chapter from the book "Dive into Deep Learning" that explores the historical development of deep convolutional neural networks (CNNs), focusing on the foundational AlexNet architecture. The authors explain the challenges faced in training CNNs before the advent of AlexNet, including limited computing power, small datasets, and lack …
 
This text is an excerpt from the "Dive into Deep Learning" book, specifically focusing on the processing of sequential data. The authors introduce the challenges of working with data that occurs in a specific order, like time series or text, and how these sequences cannot be treated as independent observations. They delve into autoregressive models…
 
This excerpt from "Mental Models," a chapter in the "People + AI Guidebook," focuses on the importance of understanding and managing user mental models when designing AI-powered products. The authors discuss how to set expectations for adaptation, onboard users in stages, plan for co-learning, and account for user expectations of human-like interac…
 
This excerpt from Hugging Face's NLP course provides a comprehensive overview of tokenization techniques used in natural language processing. Tokenizers are essential tools for transforming raw text into numerical data that machine learning models can understand. The text explores various tokenization methods, including word-based, character-based,…
 
This research paper examines the efficiency of two popular deep learning libraries, TensorFlow and PyTorch, in developing convolutional neural networks. The authors aim to determine if the choice of library impacts the overall performance of the system during training and design. They evaluate both libraries using six criteria: user-friendliness, a…
 
This document provides a comprehensive set of rules for building and deploying machine learning systems, focusing on best practices gleaned from Google’s extensive experience. The document is divided into sections that cover the key stages of the machine learning process, including launching a product without ML, designing and implementing metrics,…
 
The research paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" explores a novel approach to language modeling by combining State Space Models (SSMs), which offer linear-time inference and strong performance in long-context tasks, with Mixture of Experts (MoE), a technique that scales model parameters while minimizing…
 
We discuss how to build Agentic Retrieval Augmented Generation (RAG) systems, which use AI agents to retrieve information from various sources to answer user queries. The author details the challenges he faced when building an Agentic RAG system to answer customer support questions, and provides insights into techniques like prompt engineering and …
 
Let's get RE(a)L, U! This research paper explores the impact of different activation functions, specifically ReLU and L-ReLU, on the performance of deep learning models. The authors investigate how the choice of activation function, along with factors like the number of parameters and the shape of the model architecture, influences model accuracy ac…
 
This lecture from Stanford University's CS229 course, "Machine Learning," focuses on the theory and practice of linear regression and gradient descent, two fundamental machine learning algorithms. The lecture begins by motivating linear regression as a simple supervised learning algorithm for regression problems where the goal is to predict a conti…
 
This video discusses the vanishing gradient problem, a significant challenge in training deep neural networks. The speaker explains how, as a neural network becomes deeper, gradients—measures of how changes in network parameters affect the loss function—can decrease exponentially, leading to a situation where early layers of the network are effecti…
 
A scientific paper exploring the development and evaluation of language agents for automating data-driven scientific discovery. The authors introduce a new benchmark called ScienceAgentBench, which consists of 102 diverse tasks extracted from peer-reviewed publications across four disciplines: Bioinformatics, Computational Chemistry, Geographical I…
 
We discuss how to utilize the processing power of Graphics Processing Units (GPUs) to speed up deep learning calculations, particularly in the context of training neural networks. It outlines how to assign data to different GPUs to minimize data transfer times, a crucial aspect of performance optimization. The text highlights the importance of unde…
 
This paper provides a comprehensive overview of deep generative models (DGMs) and their applications within transportation research. It begins by outlining the fundamental principles and concepts of DGMs, focusing on various model types such as Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), Normalizing Flows, and Diffusion…
 
This research paper presents the development and evaluation of an AI-driven Smart Video Solution (SVS) designed to enhance community safety. The SVS uses existing CCTV infrastructure and recent advances in AI for anomaly detection, relying on pose-based data to preserve privacy. The system provides real-time alerts to stakeholders t…
 
The book titled "Mathematics for Machine Learning" explains various mathematical concepts that are essential for understanding machine learning algorithms, including linear algebra, analytic geometry, vector calculus, and probability. It also discusses topics such as model selection, parameter estimation, dimensionality reduction, and classificatio…
 
Here we discuss three different papers (see links below) on using D-CNNs to detect breast cancer. The first source details the development and evaluation of HIPPO, a novel explainable AI method that enhances the interpretability and trustworthiness of ABMIL models in computational pathology. HIPPO aims to address the challenges of opaque decision-m…
 
This LessWrong post explores various methods to enhance human intelligence, aiming to create individuals with significantly higher cognitive abilities than the current population. The author, TsviBT, proposes numerous approaches ranging from gene editing to brain-computer interfaces and brain emulation, discussing their potential benefits and drawb…
 
The first source is a blog post by Max Mynter, a machine learning engineer, outlining a five-to-seven step roadmap for becoming a machine learning engineer. The post emphasizes the importance of both software engineering and data science skills alongside mathematics and domain knowledge. It then offers concrete resources, including courses and book…
 
We discuss the importance of generalization in classification, where the goal is to train a model that can accurately predict labels for previously unseen data. The text first explores the role of test sets in evaluating model performance, emphasizing the need to use them sparingly and cautiously to avoid overfitting. It then introduces the conce…
 
Recognizing laughter in audio is actually a very difficult ML problem, filled with failure. Much like most comedians' jokes. Let's hope some good stuff survives. This is a review of a student's final year project for a University of Edinburgh computer science course. The project focused on creating a machine learning model to detect laughter in vid…
 
Solving an impossible mystery... forget what you thought was possible! This is a discussion of a video from Stanford's CS224W course which focuses on the many applications of graph machine learning, a field that utilizes graph data structures to solve complex problems. The speaker highlights different tasks and their associated applications, classi…
 
A research team from EyeLevel.ai has found that vector databases, which are commonly used in RAG (Retrieval-Augmented Generation) systems, have a scaling problem. Their research shows that the accuracy of vector similarity search degrades significantly as the number of pages in the database increases, leading to a substantial performance hit. This …
 
Probability and statistics are fundamental components of machine learning (ML) and deep learning (DL) because they provide the mathematical framework for understanding and analyzing data, which is crucial for making predictions and decisions. This excerpt from the "Dive into Deep Learning" documentation explains the essential concepts of probabilit…
 
This research paper examines a new deep-learning approach to optimizing weather forecasts by adjusting initial conditions. The authors test their method on the 2021 Pacific Northwest heatwave, finding that small changes to initial conditions can significantly improve the accuracy of 10-day forecasts using both the GraphCast and Pangu-Weather deep-l…
 
An introduction to the fundamental concepts of calculus, explaining how they are essential for understanding deep learning. It begins by illustrating the concept of a limit using the calculation of a circle's area, before introducing the concept of a derivative, which describes a function's rate of change. It then extends these concepts to multivar…
 
The source, "Generative AI's Act o1: The Reasoning Era Begins | Sequoia Capital," discusses the evolution of AI models from simply mimicking patterns to engaging in more deliberate reasoning. The authors argue that the next frontier in AI is the development of "System 2" thinking, where models can reason through complex problems and make decisions …
 
Swarm is an experimental, educational framework from OpenAI that explores ergonomic interfaces for multi-agent systems. It is not intended for production use, but serves as a learning tool for developers interested in multi-agent orchestration. Swarm uses two main concepts: Agents and handoffs. Agents are entities that encapsulate instructions and …
 
The provided sources detail the groundbreaking work of three scientists who were awarded the 2024 Nobel Prize in Chemistry for their contributions to protein structure prediction using artificial intelligence. David Baker, a biochemist, developed a computer program to create entirely new proteins, while Demis Hassabis and John Jumper, from Google D…
 
Dario Amodei, CEO of Anthropic, argues that powerful AI could revolutionize various fields, including healthcare, neuroscience, economics, and governance, within 5-10 years. He envisions a future where AI could cure most diseases, eradicate poverty, and even promote democracy. However, this optimistic vision is met with skepticism from Reddit users…
 
This paper examines the rapidly developing field of Retrieval-Augmented Generation (RAG), which aims to improve the capabilities of Large Language Models (LLMs) by incorporating external knowledge. The paper reviews the evolution of RAG paradigms, from the early "Naive RAG" to the more sophisticated "Advanced RAG" and "Modular RAG" approaches. It e…
 
This research paper investigates the challenges of detecting Out-of-Distribution (OOD) inputs in medical image segmentation tasks, particularly in the context of Multiple Sclerosis (MS) lesion segmentation. The authors propose a novel evaluation framework that uses 14 different sources of OOD, including synthetic artifacts and real-world variations…
 
This paper presents a new architecture for large language models called DIFF Transformer. The paper argues that conventional Transformers over-allocate attention to irrelevant parts of the input, drowning out the signal needed for accurate output. DIFF Transformer tackles this issue by using a differential attention mechanism that subtracts two sof…
 
The source is a blog post that describes the author's journey in exploring the potential of data pruning to improve the performance of AI models. They start by discussing the Minipile method, a technique for creating high-quality datasets by clustering and manually discarding low-quality content. The author then explores the concept of "foundationa…
 
This paper details the authors' research journey to replicate OpenAI's "O1" language model, which is designed to solve complex reasoning tasks. The researchers document their process with detailed insights, hypotheses, and challenges encountered. They present a novel paradigm called "Journey Learning" that enables models to learn the complete explo…
 
Let's get into the core processes of forward propagation and backpropagation in neural networks, which form the foundation of training these models. Forward propagation involves calculating the outputs of a neural network, starting with the input layer and moving towards the output layer. Backpropagation then calculates the gradients of the network…
 
This research introduces MLE-bench, a benchmark for evaluating how well AI agents perform machine learning engineering tasks. The benchmark comprises 75 Kaggle competitions, chosen for their difficulty and representativeness of real-world ML engineering skills. Researchers evaluated several state-of-the-art language models on MLE-bench, findi…
 
This systematic literature review investigates the use of convolutional neural networks (CNNs) for segmenting and classifying dental images. The review analyzes 45 studies that employed CNNs for various tasks, including tooth detection, periapical lesion detection, caries identification, and age and sex determination. The authors explore the differ…
 
This research paper proposes an AI-driven diagnostic system for Temporomandibular Joint Disorders (TMD) using MRI images. The system employs a segmentation method to identify key anatomical structures like the temporal bone, temporomandibular joint (TMJ) disc, and condyle. Using these identified structures, the system utilizes a decision tree based…
 
This research explores the potential for integrating ChatGPT and large language models (LLMs) into dental diagnostics and treatment. The authors investigate the use of these AI tools in various areas of dentistry, including diagnosis, treatment planning, patient education, and dental research. The study examines the benefits and limitations of LLMs…
 
This research paper explores the link between temporomandibular disorder (TMD) and obstructive sleep apnea (OSA). The authors created a machine learning algorithm to predict the presence of OSA in TMD patients using multimodal data, including clinical characteristics, portable polysomnography, X-ray, and MRI. Their model achieved high accuracy, wit…
 
This article describes a clinical validation study that investigates the effectiveness of a deep learning algorithm for detecting dental anomalies in intraoral radiographs. The algorithm is trained to detect six common anomaly types and is compared to the performance of dentists who evaluate the images without algorithmic assistance. The study util…
 