Jumpstart Your AI Literacy



Key Milestones of AI Evolution

1943

Warren McCulloch and Walter Pitts published "A Logical Calculus of the Ideas Immanent in Nervous Activity", which proposed neural networks modeled on human brain cells as a framework for artificial intelligence.


1956

Computer scientists including John McCarthy, Marvin Minsky, Claude Shannon & Nathaniel Rochester organized the Dartmouth Workshop, considered by many to be the seminal event where the field of artificial intelligence research was born. The proposal for the workshop first introduced the term "Artificial Intelligence".


1965

Joseph Weizenbaum developed the famous "ELIZA", a natural language processing program that simulates conversation.


1974

The first AI winter begins, marked by a decline in funding and interest in AI research due to progress not meeting expectations.


1986

Rumelhart, Hinton, and Williams publish "Learning Representations by Back-Propagating Errors", which popularizes backpropagation and makes it practical to train multi-layer neural networks.


2011

IBM's Watson defeats two former Jeopardy! champions, Ken Jennings and Brad Rutter.


2015

DeepMind develops AlphaGo, which defeats European champion Fan Hui in October 2015 and goes on to beat world champion Lee Sedol in March 2016.


2021

DeepMind's AlphaFold2 predicts protein structures with near-experimental accuracy, a major advance on the protein-folding problem that paves the way for new drug discoveries and medical breakthroughs.


2022

OpenAI released its flagship product ChatGPT in November and sparked global enthusiasm for AI development. The Pandora's Box of AI is opened: there is no turning back on AI at this point.


1950

Alan Turing, whose earlier work formalized concepts of computation and algorithms that became foundational to AI, published his famous paper "Computing Machinery and Intelligence", which proposed what is now known as the "Turing Test".


1961

The first industrial robot, Unimate, goes to work at GM, replacing humans on the assembly line.


1966

Shakey, dubbed the "first electronic person", is developed at the Stanford Research Institute (SRI). It is a general-purpose mobile robot that reasons about its own actions.


1967

Newell, Shaw, and Simon's General Problem Solver (GPS), first developed in the late 1950s, is one of the first AI programs to demonstrate human-like problem-solving ability.


1980

Expert systems gain popularity with companies using them for financial forecasting and medical diagnoses.


1997

IBM's Deep Blue defeats chess world champion Garry Kasparov, marking the first time a computer beats a reigning world champion in a match under standard tournament conditions.


2012

A Google Brain team trains a deep neural network that learns to recognize cats in YouTube videos. Facebook later creates DeepFace (2014), which can recognize faces with near-human accuracy.


2017

Google DeepMind's AlphaZero defeats the world's strongest chess and shogi engines in a series of matches.


Basic Concepts

What is Machine Learning?

Machine learning is a field of AI where computers learn from data to identify patterns and make decisions, becoming better at tasks over time without being explicitly programmed for each scenario.


What is a Neural Network?

A type of machine learning algorithm inspired by the human brain's structure and function. It is a network of connected artificial neurons that can learn to recognize patterns in data.


What are Large Language Models (LLMs)?

A type of language model built on artificial neural networks that can achieve general-purpose language understanding and generation. Most LLMs are Transformers pre-trained on large text corpora using self-supervised and semi-supervised learning.


What is the Theory of Mind?

A psychological concept that refers to the ability to understand other people by ascribing mental states to them. This includes the knowledge that others' beliefs, desires, intentions, emotions, and thoughts may be different from one's own. It is related to, but not the same as, consciousness.

Machine Learning

What is Deep Learning?

A method computers use to learn and make decisions, loosely inspired by the way the human brain operates. Imagine your brain as a very complex network of neurons that work together to help you understand and respond to the world around you. Deep learning tries to replicate this network using artificial neural networks with many layers.


What is Fine-tuning?

A technique used in machine learning to adapt a pre-trained model to a specific task or dataset. In this approach, the weights of the pre-trained model are used as the starting point and trained further on the new data.


What is K-Means Clustering?

An unsupervised learning algorithm that partitions data into k clusters. It repeatedly assigns each data point to the nearest cluster center (centroid) and then moves each centroid to the mean of the points assigned to it, until the assignments stop changing.
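
K-means can be sketched in a few lines of plain Python. This is only an illustration: the function name `kmeans`, the toy 2-D points, and the choice of initial centroids are assumptions made for the example, and a fixed number of iterations stands in for a proper convergence check.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Cluster 2-D points, starting from the given initial centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Toy data (hypothetical): two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

In practice the initial centroids are usually chosen at random or with a seeding heuristic, and the result can depend on that choice.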


What is Gradient Descent?

An optimization algorithm that is commonly used to train machine learning models and neural networks. The algorithm works by minimizing the cost function of the model by iteratively adjusting the model’s parameters until the cost function reaches a minimum value.
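
As a rough illustration, the sketch below runs gradient descent on a one-parameter toy cost function. The function name `gradient_descent`, the learning rate, the step count, and the cost function f(x) = (x - 3)^2 are all assumptions chosen for the example, not part of any particular library.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# Toy cost function (hypothetical): f(x) = (x - 3)^2.
# Its gradient is 2 * (x - 3) and its minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)  # very close to 3
```

Real models have many parameters, so `x` becomes a vector of weights and the gradient is usually computed by backpropagation, but the update rule is the same.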

What are the 3 main types of Machine Learning?

Supervised learning (such as the Naive Bayes Classifier, Support Vector Machines, Linear Regression, Logistic Regression, and Decision Trees), unsupervised learning (such as K-means clustering), and reinforcement learning.


What is Embedding?

Refers to a method used in artificial intelligence to convert complex data, like words, sentences, or even entire documents, into vectors of numbers that a computer can compare and work with. Items with similar meaning end up with similar vectors. Mostly used for Text Analysis, Machine Translation, Recommendation Systems, and Information Retrieval.
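
To make the idea concrete, the sketch below compares a few hand-written vectors with cosine similarity, a common way to measure how close two embeddings are. The three-dimensional vectors in `toy_embeddings` are invented for illustration. Real embeddings are learned by a model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, made up for this example.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])
print(cat_dog, cat_car)  # "cat" is closer to "dog" than to "car"
```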


What is Naive Bayes Classifier?

A supervised classification algorithm based on Bayes' theorem. It is called "naive" because it assumes that the features are independent of each other given the class. Despite this simplification, it often works well for tasks such as spam filtering and text classification.
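
A Naive Bayes text classifier can be sketched with word counts alone. The example below is a minimal illustration under stated assumptions: the function names `train` and `predict`, the four training messages, and the use of add-one (Laplace) smoothing are choices made for this sketch.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (words, label). Returns the counts used to predict."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, label in examples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def predict(words, model):
    """Pick the label with the highest log P(label) + sum log P(word | label)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        # Add-one smoothing so unseen words do not zero out the probability.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in words:
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Tiny hypothetical training set.
model = train([
    ("win money now".split(), "spam"),
    ("free money offer".split(), "spam"),
    ("meeting at noon".split(), "ham"),
    ("lunch meeting tomorrow".split(), "ham"),
])
print(predict("free money".split(), model))
print(predict("meeting tomorrow".split(), model))
```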


What is Dataset Augmentation?

Refers to the process of increasing the diversity of data available for training models without actually collecting new data. Commonly used in image and speech recognition. Techniques include sampling, transformations (such as flipping or rotating images), data warping, and noise injection.
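
Two of these techniques, a simple transformation and noise injection, can be sketched on a tiny grayscale "image" stored as a list of rows. The helper names `horizontal_flip` and `add_noise`, the noise scale, and the 2x3 image are assumptions for illustration only.

```python
import random

def horizontal_flip(image):
    """Mirror each row, producing a left-right flipped copy."""
    return [row[::-1] for row in image]

def add_noise(image, scale=0.05, seed=0):
    """Add small Gaussian noise to every pixel (seeded for repeatability)."""
    rng = random.Random(seed)
    return [[pixel + rng.gauss(0, scale) for pixel in row] for row in image]

# Hypothetical 2x3 image with pixel values between 0 and 1.
image = [[0.0, 0.5, 1.0],
         [0.2, 0.4, 0.6]]

# One original example becomes three training examples.
augmented = [image, horizontal_flip(image), add_noise(image)]
print(len(augmented))
```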

Major Neural Network Architectures

FNNs - Feedforward Neural Networks

The simplest type of artificial neural network architecture. In this network, the information moves in only one direction—forward—from the input nodes, through the hidden nodes (if any), and finally to the output nodes. There are no cycles or loops in the network. They are often used for simple pattern recognition and classification tasks.
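
The one-directional flow can be sketched as a forward pass through a network with one hidden layer. The weights in `params`, the layer sizes, and the choice of ReLU and sigmoid activations are assumptions picked by hand for this example. In a real network the weights are learned during training.

```python
import math

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, params):
    """Input -> hidden layer -> output, with no loops back."""
    hidden = relu(dense(x, params["W1"], params["b1"]))
    output = dense(hidden, params["W2"], params["b2"])
    return [sigmoid(o) for o in output]

# Hypothetical hand-picked weights: 2 inputs, 2 hidden nodes, 1 output.
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5]], "b1": [0.0, 0.0],
    "W2": [[1.0, -1.0]],             "b2": [0.0],
}
y = forward([2.0, 1.0], params)
print(y)
```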


RNNs - Recurrent Neural Networks

These are neural networks that can process sequences of inputs, such as time series data or natural language text. They use feedback connections to allow information to persist across time steps. Commonly used for tasks such as speech recognition, machine translation, and sentiment analysis.
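
The feedback connection can be sketched with a single scalar hidden state that is updated at every time step. The function name `rnn_forward` and the weight values are assumptions for illustration. Real RNNs use weight matrices and vector-valued states.

```python
import math

def rnn_forward(inputs, w_x, w_h, b, h0=0.0):
    """Run a one-unit recurrent network over a sequence of numbers."""
    h = h0
    states = []
    for x in inputs:
        # The new state depends on the current input AND the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

# Only the first input is non-zero, yet its effect persists in later states.
states = rnn_forward([1.0, 0.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
print(states)
```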


Transformer Networks

Known for their effectiveness in handling sequential data, especially in natural language processing tasks. Transformers are different from RNNs and LSTMs as they handle sequences through attention mechanisms, allowing them to process input data in parallel and capture long-range dependencies more effectively. They are the backbone of models like BERT and GPT series.
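
The core of the attention mechanism, scaled dot-product attention, can be sketched in plain Python. This is a simplified single-head version with no learned projections or masking. The function names and the tiny query, key, and value vectors are assumptions for the example.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Each query receives a weighted average of all values."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical two-token sequence attending to itself.
tokens = [[1.0, 0.0], [0.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
print(weights)
print(outputs)
```

Because every query is scored against every key independently, all positions can be processed in parallel, unlike the step-by-step loop of an RNN.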


GANs - Generative Adversarial Networks

A GAN consists of two neural networks trained against each other: a generator network that produces new data samples and a discriminator network that tries to tell real samples from generated ones. GANs are commonly used for tasks such as image synthesis, video prediction, and style transfer.

CNNs - Convolutional Neural Networks

Especially popular in image recognition and processing, CNNs are designed to automatically and adaptively learn spatial hierarchies of features through backpropagation. They are used in image and video recognition, recommender systems, image classification, medical image analysis, and natural language processing.
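
The basic building block of a CNN is sliding a small filter (kernel) over an image. The sketch below applies a hand-written edge-detecting kernel to a toy image. In a real CNN the kernel values are learned through backpropagation rather than chosen by hand, and the name `conv2d` here is simply a label for this example.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# Hypothetical 3x4 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1] for _ in range(3)]
# Hand-picked kernel that responds where brightness increases left to right.
kernel = [[-1, 1]]

feature_map = conv2d(image, kernel)
print(feature_map)  # non-zero only at the vertical edge
```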


LSTMs - Long Short-term Memory Networks

A special kind of RNN, LSTMs are designed to remember long-term dependencies in sequence data. This is particularly useful in applications where the context is important, such as in language translation, text generation, and even in some time-series analysis tasks.
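
How an LSTM holds on to information can be sketched with a single scalar cell. The gates decide how much of the old cell state to keep, how much new information to write, and how much to expose. The parameter names in `p` and their values are assumptions chosen so that the cell remembers its previous state.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a one-unit LSTM cell."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g      # keep part of the old memory, add new memory
    h = o * math.tanh(c)        # expose part of the memory as output
    return h, c

# Hypothetical parameters: forget gate wide open, input gate almost closed.
p = {"wf": 0.0, "uf": 0.0, "bf": 10.0,
     "wi": 0.0, "ui": 0.0, "bi": -10.0,
     "wo": 0.0, "uo": 0.0, "bo": 0.0,
     "wg": 0.0, "ug": 0.0, "bg": 0.0}

h, c = 0.0, 1.0
for x in [5.0, -3.0, 2.0]:
    h, c = lstm_step(x, h, c, p)
print(c)  # still close to the initial cell state of 1.0
```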


Autoencoders

Used for unsupervised learning, an autoencoder consists of an encoder network that maps the input data to a lower-dimensional representation and a decoder network that reconstructs the original input from that representation.


ResNet - Residual Neural Network

Designed to address the vanishing gradient problem by introducing "skip connections" that carry information from earlier layers forward, so each block only has to learn a residual improvement. This makes training very deep neural networks feasible.
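
A skip connection is simple to express: the block's output is its input plus whatever the inner layer computes. In the sketch below, `residual_block` and the two stand-in layers are assumptions for illustration. A real residual block would wrap convolutional layers with learned weights.

```python
def residual_block(x, layer):
    """Return x + layer(x), so the layer only has to learn the residual."""
    return [xi + ri for xi, ri in zip(x, layer(x))]

# Hypothetical stand-ins for learned layers.
def zero_layer(values):
    return [0.0 for _ in values]

def half_layer(values):
    return [0.5 * v for v in values]

# If the inner layer outputs nothing useful, the input still passes through.
print(residual_block([1.0, -2.0], zero_layer))
print(residual_block([1.0, 2.0], half_layer))
```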


Well-known Large Language Models (LLMs)

ChatGPT

A widely used natural language chatbot developed by OpenAI, powered by GPT-series LLMs based on the transformer architecture. It has been trained on a massive corpus of text data and can perform a wide range of natural language processing (NLP) tasks such as language translation, question answering, and text generation.


LLaMA

A family of LLMs developed by Meta and based on the transformer architecture. It has been designed to be more efficient and less resource-intensive than other models, with versions that are smaller than many other LLMs.


BART (Bidirectional and Auto-Regressive Transformers)

A transformer-based model developed by Facebook AI Research that is capable of performing a wide range of NLP tasks such as language translation, question answering, and text generation. It is unique in that it is trained as a denoising autoencoder: the input text is deliberately corrupted with noise and the model learns to reconstruct the original.

Claude

An LLM developed by Anthropic, also based on the transformer architecture. It has been trained on a massive corpus of text data and can perform a wide range of NLP tasks such as language translation, question answering, and text generation. It emphasizes AI safety.


Falcon

Falcon LLM is developed by the Technology Innovation Institute (TII) based in Abu Dhabi, United Arab Emirates. Its largest version has 180B parameters and was trained on a massive 3.5 trillion tokens using TII's RefinedWeb dataset. At its release it was the largest openly available language model and was considered one of the most capable LLMs publicly known, outperforming GPT-3.5 and LLaMA 2 on various benchmarks.


PaLM (Pathways Language Model)

An LLM developed by Google that aims to handle many tasks at once, learn new tasks quickly, and reflect a better understanding of the world. PaLM is a massive undertaking with ambitious goals and is the first outcome of Pathways, Google's new AI architecture.