Jure Leskovec
Chief Scientist
Machine Learning at Pinterest
Confidential
Pinterest is a visual bookmarking tool
and discovery engine
Users pin images and sites they like onto
boards
Every pin on Pinterest is added by a
human and lives on a board
Users heavily curate their content
What is Pinterest?
• Image
• URL: http://www.culinaria.com…
• User-generated details
• User-curated pin-board graph
• User-curated annotations
• On-site performance (click actions,
impressions, …)
• Web crawl data
What is a Pin?
Internet
Offsite
Save
Do
Pinterest
Pinterest is a Visual Discovery Engine
Pinterest: Pins and Boards
Pin Board
Pinterest is a Giant Bipartite Graph
30+ Billion Pins
categorized by people into more than
750 Million Boards
Many parts driven by ML
Personalization
• Pin and board recommendations
• New-user topic recommendations
Notifications
• Email timing, frequency, content
Ads and monetization
• User action prediction
Related pins
• Which pins are related to a given pin
Ranking
• Homefeed pin ranking
ML at Pinterest
What interests shall
we recommend to a
new user?
Example ML projects at Pinterest
[Pong Eksombatchai, Dave Cummings, Pei Yin, Dan Frankowski]
What are the interests of a user?
New User Sign-up Flow
The user has just joined and has no clue what
Pinterest is
• Problem: Product comprehension
We have tens of thousands of interests to
recommend from
• Problem: We cannot score all the interests
Business metric we want to optimize is WAR28
(weekly active repinner after 28 days)
• Problem: What is the right notion of a positive label?
Why is it hard?
How to generate
engaging homefeed?
Example ML projects at Pinterest
[Mukund Narasimhan, Yuchen Lie, Dmitry Chechik, Yunsong Guo, …]
Diverse, Relevant, Endless set of pins to a user
Show pins and content meaningful to a user without a
specific query
Combines content from:
• Users or boards you follow
• Interests you follow
• Recommendations
Homefeed
Generating candidates
• Find pins that we think you’ll like
Scoring and ranking
• Picking the best of the best among candidates
Blending of different sources
• Followed boards/users/interests, recommendations
Creating final feed
• Doing this for 10s of millions of users multiple times a day
Why is it hard?
No diversity. Some pins with low relevance.
Ranked by Time
More diversity. More relevance.
Ranked by ML model
How do pins
relate to each other?
Example ML projects at Pinterest
[David Liu, Dmitry Kislyuk, …]
Can we discover
relationships
between pins and
fit them into a
giant network?
Object Graph: Nodes
Object Graph: Relations
Substitutes
Complements
Why is it hard?
Systems challenges
• Billions of pins
• Find related pins of each given pin
Machine learning approach
• Classification vs. Ranking?
Ground-truth labels
• What is a good notion of ground-truth?
• Clicks? How do we correct for position bias?
Offline evaluation
• What is a good metric for offline evaluation?
Related Pins
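One common way to approach the position-bias question above is inverse propensity weighting: each click is weighted by the inverse of the estimated probability that its slot was examined. The sketch below is an illustration only. The slides do not say which method was used, and the examination probabilities, the function name `ipw_click_rates`, and the log format are all assumptions.

```python
from collections import defaultdict

# Hypothetical probability that a user examines each slot (position 0 is the top).
# In practice these would have to be estimated, e.g. from randomized rankings.
EXAMINE_PROB = {0: 1.0, 1: 0.5, 2: 0.25}


def ipw_click_rates(impressions):
    """Estimate a position-debiased click rate per pin.

    impressions: iterable of (pin_id, position, clicked) tuples.
    Each click is weighted by 1 / P(examined at position); the weighted
    clicks are then averaged over the number of impressions.
    """
    weighted_clicks = defaultdict(float)
    shown = defaultdict(int)
    for pin_id, position, clicked in impressions:
        shown[pin_id] += 1
        if clicked:
            weighted_clicks[pin_id] += 1.0 / EXAMINE_PROB[position]
    return {pin: weighted_clicks[pin] / shown[pin] for pin in shown}


# Synthetic log: pin_a is always shown at the top, pin_b always in slot 2.
log = [
    ("pin_a", 0, True), ("pin_a", 0, False),
    ("pin_b", 2, True), ("pin_b", 2, False),
    ("pin_b", 2, False), ("pin_b", 2, False),
]
rates = ipw_click_rates(log)
```

On this synthetic log the raw click rate favors pin_a (1 of 2 versus 1 of 4), while the weighted estimate favors pin_b, because its single click came from a slot that is rarely examined.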
What “interests” does
a pin belong to?
[Leon Lin, Lingzhi Luo, Ningning Hu, Eugene Ie, Tao Cheng, …]
Example ML projects at Pinterest
Example: Interest Classification
Women’s
Fashion
Food & Drink
Geek
TASK: Given a pin, determine its interest(s)
From Pins to Interests
Black Box
Food&Drink
Lower back tattoos
Canoeing
…
Hair
Geek
Some interests are specific, others are general
Huge interest size imbalance: 10% to 0.1%
• Problem: Always saying “not my interest” is 99%
correct
Don’t know the interest sizes in the “wild”
• Problem: Overpredict rare, underpredict common
ones
Solution has to scale to 1000s interests and many
languages
• We developed on English, deployed in French
Why is it hard?
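The imbalance problem above can be made concrete with a small sketch. The labels are synthetic (1% positives, matching the 99% figure on the slide), and the helper names are illustrative, not taken from the talk.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Synthetic data: 10 of 1,000 pins belong to the interest.
y_true = [1] * 10 + [0] * 990
always_no = [0] * len(y_true)      # the "not my interest" classifier

acc = accuracy(y_true, always_no)
prec, rec = precision_recall(y_true, always_no)
```

The trivial classifier reaches 99% accuracy while finding none of the pins that belong to the interest, which is why accuracy alone is a poor metric for rare interests.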
Lessons
learned at Pinterest
Generating candidates
• Find pins that we think you’ll like
Scoring and ranking
• Picking the best of the best among candidates
Blending of different sources
• Followed boards/users/topics, recommendations
Creating final recommendations
• Doing this for 10s of millions of users multiple times a day
Machine Learning Problems
Problems we’re trying to solve
No dataset
• We have to create a dataset
• Which users to use? What time period?
No labels
• We have to pick the labels
• What is a good signal for a positive/negative label?
• Can “no label” be considered as “negative label”?
Deployment
• We have to serve the model to 100M+ users
• How do we generate, store, and query features?
• How do we score the recommendations?
Many Challenges
1. Know your data
Carefully think about the input data
2. More is better
Don’t be afraid to try many times!
3. Evaluation is hard
Move fast but be scientific about it :)
Lessons Learned
What did we learn along the way?
Know your data
Learning 1
There is no objective dataset
Production changes everything
Make it easy to look at the raw data, raw
results…
Build intuition about the data and what
steps to take next
• There are lots of subtleties in how training data is
generated
- How the data is sampled matters
- The characteristics of the data change over time
- Distributions change upon deployment
- We make choices based on computational constraints (ratio of
positive to negative instances, size of data set)
• Varying these has a bigger impact on the final
model than varying algorithms
• More important to examine/vary/test these than (for
example) the regularization parameter
There is no Objective Dataset
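As one illustration of how a sampling choice shapes the model, the sketch below downsamples negatives and then maps predicted probabilities back to the original scale. This is a standard correction for negative downsampling, not something the slides describe; the keep rate, the data, and the function names are assumptions.

```python
import random


def downsample_negatives(examples, keep_rate, seed=0):
    """Keep every positive and a random fraction keep_rate of the negatives."""
    rng = random.Random(seed)
    return [(x, y) for x, y in examples if y == 1 or rng.random() < keep_rate]


def recalibrate(p_sampled, keep_rate):
    """Map a probability learned on downsampled data back to the original scale.

    Dropping negatives inflates the odds by 1 / keep_rate, so the correction
    multiplies the odds by keep_rate again.
    """
    return p_sampled / (p_sampled + (1.0 - p_sampled) / keep_rate)


# Synthetic data set: 10 positives and 990 negatives.
examples = [(i, 1) for i in range(10)] + [(i, 0) for i in range(10, 1000)]
sampled = downsample_negatives(examples, keep_rate=0.1)
positives_kept = sum(y for _, y in sampled)

# A score of 0.5 on the sampled data corresponds to 1/11 on the full data.
corrected = recalibrate(0.5, keep_rate=0.1)
```

Changing `keep_rate` changes both the training set and the meaning of the raw model score, which is the kind of choice the slide suggests examining before tuning the algorithm.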
• The data distribution is different
- Need to deal with missing data
- Need to deal with malformed data
- Systems have to work under difficult circumstances
Upstream services may go down, but system should continue
to provide reasonable responses
Defining fallback behavior is important
• Offline/Online consistency takes work
• Investment in monitoring, measurement,
deployment, debugging is crucial
Production Changes Everything
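A minimal sketch of the fallback behavior mentioned above, assuming a hypothetical feature set with default values and a precomputed list of popular pins to serve when an upstream call fails. None of these names come from the talk.

```python
# Hypothetical features and their default values.
DEFAULTS = {"ctr": 0.01, "age_days": 30.0}


def clean_features(raw):
    """Replace missing or malformed feature values with defaults."""
    features = {}
    for name, default in DEFAULTS.items():
        value = raw.get(name)
        try:
            value = float(value)
        except (TypeError, ValueError):
            value = default            # missing (None) or unparseable value
        if value != value or value < 0:
            value = default            # NaN or an impossible negative value
        features[name] = value
    return features


def recommend(user_id, fetch_candidates, popular_pins):
    """Return personalized candidates, or popular pins if the upstream fails."""
    try:
        return fetch_candidates(user_id)
    except Exception:
        # A real system would catch narrower errors and log the failure.
        return popular_pins


def flaky_service(user_id):
    """Stand-in for an upstream service that is currently down."""
    raise TimeoutError("upstream down")


fallback = recommend("u1", flaky_service, ["p1", "p2"])
```

The point of the sketch is that both the feature path and the serving path have an explicit, documented answer for bad input, so the system keeps giving reasonable responses.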
Example: Interest Classification
Women’s
Fashion
Food & Drink
Geek
Approach: One vs. Rest
Geek
W’s Fashion
Food and Drink
Canoeing
…
Interest
Classifier
Geek Women’s
fashion
Canoeing
Food and drink
Canoeing
Classifier
Approach: One vs. Rest
Geek
W’s Fashion
Food and Drink
Canoeing
…
Interest
Classifier
Geek Women’s
fashion
Canoeing
Food and drink
Geek
Classifier
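The one-vs-rest setup on these slides can be sketched as follows: one binary classifier per interest, each trained with that interest as the positive class and everything else as negative. The slides do not name the underlying learner, so a simple perceptron is swapped in here for illustration; the three toy features and the data are invented.

```python
def train_perceptron(rows, labels, epochs=20):
    """Train a binary perceptron; labels are +1 or -1."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            if y * score <= 0:     # misclassified or on the boundary: update
                weights = [w + y * v for w, v in zip(weights, x)]
                bias += y
    return weights, bias


def train_one_vs_rest(rows, interests):
    """Train one classifier per interest: that interest vs. all the rest."""
    models = {}
    for target in sorted(set(interests)):
        labels = [1 if interest == target else -1 for interest in interests]
        models[target] = train_perceptron(rows, labels)
    return models


def predict(models, x):
    """Return the interest whose classifier gives the highest score."""
    def score(model):
        weights, bias = model
        return sum(w * v for w, v in zip(weights, x)) + bias
    return max(models, key=lambda name: score(models[name]))


# Toy features: [mentions_paddle, mentions_recipe, mentions_gadget]
rows = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
interests = ["Canoeing", "Canoeing", "Food and Drink",
             "Food and Drink", "Geek", "Geek"]

models = train_one_vs_rest(rows, interests)
prediction = predict(models, [1, 0, 0])
```

Note that the "rest" in each training set is made up of pins labeled with other interests, which is exactly where the distribution mismatch with unlabeled production pins comes from.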
Production Data Distribution does not Match
Geek Women’s
fashion
Canoeing
Food and drink
Geek
Classifier
Carefully think
about biases in
the data
Unlabeled
pins
Hairstyles: Not enough unlabeled
Hairstyles: Too much unlabeled
Hairstyles: Just enough unlabeled
More is better
Learning 2
• More is better:
- Models, Data, Features, Experiments
• Hard to tell upfront what will work and
what won’t
- Try lots of things
• Optimize for scale, flexibility,
debuggability
- Simple and consistent systems scale better
For example:
We started with 39 features…
Quickly expanded to 670,000
features
7x gain in performance (F1 score)!
No manual feature selection.
Let the model select features!
More is better
Classifying pins to interests
Women’s
Fashion
Food &
Drink
Geek
Canoeing: 30k pins, 39 features
Canoeing: 1m pins, 670k features
While scaling up be very careful
• Robust systems work in the presence of errors
- Incorrectly implemented features
- Features go missing
- Models translated incorrectly
- Data missing for a subset of users
• Treating ML systems as black boxes, looking
only at their output is dangerous
- Especially when you are not sure what to look for
- And because errors manifest as slightly lower accuracy
- And because you don't know what accuracy to expect
But: ML Systems Hide Bugs
Evaluation is Hard
Learning 3
Evaluation is always hard
Not obvious whether an offline metric will
correlate with an online metric
Offline metric is a complex function of
dataset creation, ground-truth labels, and
the ML algorithm
Training objectives / Offline metrics / Online
metrics can be very different
- Some correlation is expected, but once your models
are sufficiently optimized, they begin to diverge
- Online metrics are the only ones that matter, but are
very expensive
- Offline metric should predict online metric
Naive split of training / testing is suboptimal
- There is often a lot of subjectivity that goes into
training data selection
- More important that the evaluation data reflect reality
than that they reflect the training data
Evaluation is Hard
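One way to make evaluation data look more like what the model will meet in production is to split by time rather than at random, so the model is always tested on events that happened after its training data. The slide does not prescribe this; the sketch below, with invented events and a hypothetical cutoff date, only illustrates the idea.

```python
from datetime import date


def time_based_split(events, cutoff):
    """Train on events before the cutoff, evaluate on events at or after it."""
    train = [e for e in events if e["day"] < cutoff]
    test = [e for e in events if e["day"] >= cutoff]
    return train, test


# Invented interaction log; dates and users are placeholders.
events = [
    {"day": date(2015, 1, 5), "user": "u1", "label": 1},
    {"day": date(2015, 1, 20), "user": "u2", "label": 0},
    {"day": date(2015, 2, 3), "user": "u1", "label": 0},
    {"day": date(2015, 2, 14), "user": "u3", "label": 1},
]

train, test = time_based_split(events, cutoff=date(2015, 2, 1))
```

A random split of the same log would let the model train on February events and be tested on January ones, which can make offline metrics look better than online results.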
• User features:
• Landing page, demographics, Facebook
• Interest features:
• Topics, annotations, etc.
• Model: User-cross-Interests
• Feature hashing
• What are the labels?
• Not what user follows but interests of pins
user is going to interact with in the future
• Negative labels: Seen but not interacted
• Scoring: Score 1k location- and gender-specific
interests in real time
New User Interest
Recommendations
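The user-cross-interest model with feature hashing could look roughly like the sketch below. The bucket count, the feature strings, and the choice of MD5 as the hash are assumptions made for illustration; the slide only names the ingredients.

```python
import hashlib

NUM_BUCKETS = 2 ** 20    # hypothetical size of the hashed feature space


def bucket(feature_name):
    """Map a feature string to a stable bucket index."""
    digest = hashlib.md5(feature_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def cross_features(user_features, interest_features):
    """Hash every (user feature, interest feature) pair into a bucket index."""
    indices = set()
    for u in user_features:
        for i in interest_features:
            indices.add(bucket(f"{u}|{i}"))
    return sorted(indices)


def score(weights, indices):
    """Linear score: sum of the learned weights of the active buckets."""
    return sum(weights.get(idx, 0.0) for idx in indices)


# Invented example features for one user and one candidate interest.
user = ["country=FR", "gender=f"]
interest = ["topic=hair", "annotation=braids"]
indices = cross_features(user, interest)
```

Hashing keeps the model size fixed no matter how many distinct user and interest features appear, at the cost of occasional collisions between unrelated pairs.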
User follows interests
User interacts with pins
Idea: Recommend interests
that user is going to interact
with in the future
Evaluation
• Number of followed interests (bad)
• Number of pins interacted (good)
• AUC and Precision at top 10
• Baselines: Random, Popularity
In two months we:
• Ran 1,000s of offline experiments
• Trained 1,000s of models to find a
useful one
• Generated 2,338 graphs, 148k pin
galleries
New User Interest
Recommendations
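The two offline metrics named above, AUC and precision at the top k, can be computed as in this sketch. The scores and labels are made up, and the pairwise AUC is written for clarity rather than speed.

```python
def precision_at_k(scores, labels, k=10):
    """Fraction of positives among the k highest-scored items."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    return sum(label for _, label in top) / len(top)


def auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise version is O(n^2).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up model scores and labels (1 = user later interacted with the interest).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 0, 1, 1, 0, 0]

model_auc = auc(scores, labels)
model_p3 = precision_at_k(scores, labels, k=3)

# A baseline that gives every item the same score carries no ranking signal.
baseline_auc = auc([0.5] * len(labels), labels)
```

Here the model ranks 7 of the 9 positive-negative pairs correctly, while the constant baseline sits at 0.5, which gives a reference point for judging any candidate model.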
Evaluation is Hard
Define clear offline success metrics
• Consider many metrics
Build meaningful baselines
Clear offline metrics allow you to
quickly compare solutions and
prune bad directions
Models can live for a long time
- Long term hold outs (> 1 year)
- Not all effects can be observed in a short timeframe
Models should be independent of infrastructure and
environment
- Infrastructure lifetime and Model lifetime should be independent
- Should be able to deploy models in different environments
Harder to track progress over time
- Changes are not additive
- Only way to determine progress is to compare with older
models
Old Models Never Die
Possible Solutions
Performance
Explore and Learn
Systematically explore
Learn from your failures
What should
we do?
What are some best practices?
• Having a repeatable, push-button, stable process is
enormously valuable
• Automation encourages experimentation
- Try variations easily
- Reduces temptation to bundle changes
- Easy baseline, good starting point
• Regular retraining is enormously valuable
• A new team member should be able to go through a
documented process and end up with a model
which is on par with production
Automation Pays for Itself
• We have hundreds of models in production
- Trained by different engineers
- Optimizing for different criteria
- Using different features
- Meant for different purposes
- But running on the same infrastructure
• You need a process for
- Model Storage and Search
- Model Deployment, Documentation and Review
- Keeping Model coupling/dependencies in check
- Tracking experiments, communicating successes and failures
Models Need to be Managed
• Make everything explicit (via DSL)
- A (linear) model is not just an array of coefficients
- It should list the source/raw-features
- It should contain the feature transforms
- It should contain the score transform/calibration/link function
- It should document how it was built, who built it,
when it was built, and point to instructions to reproduce it
• Config is better than Code
- Create a well documented model specification language
- That is human readable
- But manipulatable by tools (introspection, refactoring, etc.)
• Minimize dependencies on environment
Avoid Implicit Assumptions
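To make the "model is not just an array of coefficients" point concrete, here is a minimal sketch of an explicit model specification. It is not the DSL used at Pinterest, which the slides do not show; every field name, the transform and link registries, and the placeholder metadata values are assumptions.

```python
import json
import math
from dataclasses import dataclass, asdict


@dataclass
class ModelSpec:
    """Hypothetical explicit description of a linear model."""
    name: str
    raw_features: list          # source features the model reads
    transforms: dict            # feature name -> transform name
    coefficients: dict          # feature name -> weight
    intercept: float
    link: str                   # score transform, e.g. "logistic"
    built_by: str
    built_on: str
    repro_instructions: str


# Named transforms and link functions, so the spec stays plain config.
TRANSFORMS = {"identity": lambda v: v, "log1p": math.log1p}
LINKS = {"identity": lambda z: z,
         "logistic": lambda z: 1.0 / (1.0 + math.exp(-z))}


def score(spec, raw):
    """Apply the feature transforms, the linear model, and the link function."""
    z = spec.intercept
    for name in spec.raw_features:
        value = TRANSFORMS[spec.transforms[name]](raw[name])
        z += spec.coefficients[name] * value
    return LINKS[spec.link](z)


spec = ModelSpec(
    name="example_linear_model",
    raw_features=["repin_count"],
    transforms={"repin_count": "log1p"},
    coefficients={"repin_count": 0.0},
    intercept=0.0,
    link="logistic",
    built_by="hypothetical-engineer",
    built_on="1970-01-01",
    repro_instructions="placeholder: link to the training runbook",
)

# The whole spec is human readable and can be inspected by tools.
spec_json = json.dumps(asdict(spec), sort_keys=True)
```

Because the spec is plain data, it can be stored, searched, reviewed, and diffed without running any model code, which is the property the slide argues for.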
• Infrastructure is critical
• Building high quality systems requires experts
from different domains
- How do ML engineers build models without deep understanding
of the infrastructure?
- How do infrastructure experts build/scale/evolve the system?
• Decoupling infrastructure and modeling is hard but
worth it
- Allows people with different backgrounds to work together
- Requires well thought out interfaces
- Which is rarely achieved through organic evolution
There is more to ML Systems
than ML
• 100M+ users
Vast, diverse and changing user base makes user modeling a
challenge
Product has to work well for niche as well as mainstream
populations
Optimizing for majority can hurt subgroups
Monitoring needs to be intelligent
• Billions of pieces of content
Modeling is crucial
Need to trade off recency, diversity, relevance, and ecosystem
effects
Everything Gets Amplified at
Scale
jure@pinterest.com
Come work with us!
Thanks to Mukund, Dmitry, David, Pong, Dave, and Leon

The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec

Ria Sankar - How to Build Winning Products - Product School Bellevue - 83018
Ria Sankar - How to Build Winning Products - Product School Bellevue - 83018 Ria Sankar - How to Build Winning Products - Product School Bellevue - 83018
Ria Sankar - How to Build Winning Products - Product School Bellevue - 83018 Ria Sankar
 
Human computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveHuman computation, crowdsourcing and social: An industrial perspective
Human computation, crowdsourcing and social: An industrial perspectiveoralonso
 
When Mobile meets UX/UI powered by Growth Hacking Asia
When Mobile meets UX/UI powered by Growth Hacking AsiaWhen Mobile meets UX/UI powered by Growth Hacking Asia
When Mobile meets UX/UI powered by Growth Hacking AsiaGrowth Hacking Asia
 
ASC Marketing Workshop - Mar 2012
ASC Marketing Workshop - Mar 2012ASC Marketing Workshop - Mar 2012
ASC Marketing Workshop - Mar 2012TRG Arts
 
IDM Assignment revision certificate Nov '11
IDM Assignment revision certificate Nov '11IDM Assignment revision certificate Nov '11
IDM Assignment revision certificate Nov '11Steve Kemish
 
Think tank - Data Culture for a Better Business
Think tank - Data Culture for a Better BusinessThink tank - Data Culture for a Better Business
Think tank - Data Culture for a Better BusinessDan Cave
 
Modern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyModern Perspectives on Recommender Systems and their Applications in Mendeley
Modern Perspectives on Recommender Systems and their Applications in MendeleyKris Jack
 
Introduction to Digital Life (March 2017)
Introduction to Digital Life (March 2017)Introduction to Digital Life (March 2017)
Introduction to Digital Life (March 2017)KR_Barker
 
Demystifying Recommendation Systems
Demystifying Recommendation SystemsDemystifying Recommendation Systems
Demystifying Recommendation SystemsRumman Chowdhury
 
Intro to Product Management
Intro to Product Management Intro to Product Management
Intro to Product Management Ria Sankar
 
Getting Under the Hood: What Analytics and Metrics Can Show You About Your We...
Getting Under the Hood: What Analytics and Metrics Can Show You About Your We...Getting Under the Hood: What Analytics and Metrics Can Show You About Your We...
Getting Under the Hood: What Analytics and Metrics Can Show You About Your We...Hartford Foundation for Public Giving
 
Building an Excellent Web Startup
Building an Excellent Web StartupBuilding an Excellent Web Startup
Building an Excellent Web Startupmatthewhyatt
 
Tech and Ethics
Tech and EthicsTech and Ethics
Tech and EthicsDeb Osborn
 
User Onboarding - Startup Launchpad - Masterclass 2019
User Onboarding - Startup Launchpad - Masterclass 2019User Onboarding - Startup Launchpad - Masterclass 2019
User Onboarding - Startup Launchpad - Masterclass 2019Marie-Rose Tripault
 
Introduction to Digital Life (October 2016)
Introduction to Digital Life (October 2016)Introduction to Digital Life (October 2016)
Introduction to Digital Life (October 2016)KR_Barker
 
An Introduction to the World of User Research
An Introduction to the World of User ResearchAn Introduction to the World of User Research
An Introduction to the World of User ResearchMethods
 


 




The Hive Think Tank: Machine Learning at Pinterest by Jure Leskovec

  • 2. Pinterest is a visual bookmarking tool and discovery engine Users pin images and sites they like onto boards Every pin on Pinterest is added by a human and lives on a board Users heavily curate their content What is Pinterest?
  • 3. • Image • URL: http://www.culinaria.com… • User-generated details • User-curated pin-board graph • User-curated annotations • On-site performance (click actions, impressions, …) • Web crawl data What is a Pin?
  • 6. Pinterest is a Giant Bipartite Graph
  • 7. 30+ Billion Pins categorized by people into 750+ Million Boards
  • 8. Many parts driven by ML Personalization • Pin and board recommendations • New-user topic recommendations Notifications • Email timing, frequency, content Ads and monetization • User action prediction Related pins • Which pins are related to a given pin Ranking • Homefeed pin ranking ML at Pinterest
  • 9. What interests shall we recommend to a new user? Example ML projects at Pinterest [Pong Eksombatchai, Dave Cummings, Pei Yin, Dan Frankowski]
  • 10. What are the interests of a user? New User Sign-up Flow
  • 11.
  • 12. User has just joined, they have no clue what Pinterest is • Problem: Product comprehension We have tens of thousands of interests to recommend from • Problem: We cannot score all the interests Business metric we want to optimize is WAR28 (weekly active repinner after 28 days) • Problem: What is the right notion of a positive label? Why is it hard?
  • 13. How to generate engaging homefeed? Example ML projects at Pinterest [Mukund Narasimhan, Yuchen Lie, Dmitry Chechik, Yunsong Guo, …]
  • 14. Diverse, Relevant, Endless set of pins to a user Show pins and content meaningful to a user without a specific query Combines content from: • Users or boards you follow • Interests you follow • Recommendations Homefeed
  • 15. Generating candidates • Find pins that we think you’ll like Scoring and ranking • Picking the best of the best among candidates Blending of different sources • Followed boards/users/interests, recommendations Creating final feed • Doing this for 10s of millions of users multiple times a day Why is it hard?
  • 16. No diversity. Some pins with low relevance. Ranked by Time
  • 17. More diversity. More relevance. Ranked by ML model
  • 18. How do pins relate to each other? Example ML projects at Pinterest [David Liu, Dmitry Kislyuk, …]
  • 19. Can we discover relationships between pins and fit them into a giant network?
  • 22. Why is it hard? Systems challenges • Billions of pins • Find related pins of each given pin Machine learning approach • Classification vs. Ranking? Ground-truth labels • What is a good notion of ground-truth? • Clicks? How do we de-bias position bias? Offline evaluation • What is a good metric for offline evaluation? Related Pins
  • 23. What “interests” does a pin belong to? [Leon Lin, Lingzhi Luo, Ningning Hu, Eugene Ie, Tao Cheng, …] Example ML projects at Pinterest
  • 25. TASK: Given a pin, determine its interest(s) From Pins to Interests Black Box Food&Drink Lower back tattoos Canoeing … Hair Geek
  • 26. Some interests are specific, others are general Huge interest size imbalance: 10% to 0.1% • Problem: Always saying “not my interest” is 99% correct Don’t know the interest sizes in the “wild” • Problem: Overpredict rare, underpredict common ones Solution has to scale to 1000s of interests and many languages • We developed on English, deployed in French Why is it hard?
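The accuracy trap on slide 26 can be checked with a little arithmetic. The sketch below assumes a hypothetical interest that covers 1% of pins and a classifier that always answers "not my interest"; the counts are illustrative and are not Pinterest data.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Hypothetical data: 1,000 pins, 10 of which (1%) belong to the interest.
y_true = [1] * 10 + [0] * 990
always_negative = [0] * 1000  # "not my interest" for every pin

accuracy, precision, recall, f1 = binary_metrics(y_true, always_negative)
print(f"accuracy={accuracy:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Accuracy looks excellent while recall and F1 are zero, which is one reason to prefer a metric such as the F1 score when interest sizes are this skewed.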
  • 28. Generating candidates • Find pins that we think you’ll like Scoring and ranking • Picking the best of the best among candidates Blending of different sources • Followed boards/users/topics, recommendations Creating final recommendations • Doing this for 10s of millions of users multiple times a day Machine Learning Problems Problems we’re trying to solve
  • 29. No dataset • We have to create a dataset • Which users to use? What time period? No labels • We have to pick the labels • What is a good signal for positive/negative label? • Can “no label” be considered as “negative label”? Deployment • We have to serve the model to 100m+ users • How do we generate, store, and query features? • How do we score the recommendations? Many Challenges
  • 30. Know your data Carefully think about the input data More is better Don’t be afraid to try many times! Evaluation is hard Move fast but be scientific about it :) Lessons Learned 1 2 3 What did we learn along the way?
  • 31. Know your data Learning 1 1 There is no objective dataset Production changes everything Make it easy to look at the raw data, raw results… Build intuition about the data and what steps to take next
  • 32. • There are lots of subtleties in how training data is generated - How the data is sampled matters - The characteristics of the data change with time - Distributions change upon deployment - We make choices based on computational constraints (ratio of positive to negative instances, size of data set) • Varying these has a bigger impact on the final model than varying algorithms • More important to examine/vary/test these than (for example) the regularization parameter There is no Objective Dataset
  • 33. • The data distribution is different - Need to deal with missing data - Need to deal with malformed data - Systems have to work under difficult circumstances: upstream services may go down, but the system should continue to provide reasonable responses; defining fallback behavior is important • Offline/Online consistency takes work • Investment in monitoring, measurement, deployment, debugging is crucial Production Changes Everything
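The fallback behavior mentioned on slide 33 could look roughly like the sketch below. All names here (`recommend_with_fallback`, `flaky_model`, `popular_interests`) are hypothetical and only illustrate the idea of returning a reasonable response when an upstream dependency fails; this is not a description of Pinterest's serving code.

```python
def recommend_with_fallback(user_id, primary, fallback, k=3):
    """Try the primary recommender; use the fallback on errors or empty output."""
    try:
        items = primary(user_id)
    except Exception:
        # Deliberately broad in this sketch: any upstream failure
        # should degrade to the fallback instead of an error page.
        items = None
    if not items:
        items = fallback(user_id)
    return list(items)[:k]


def flaky_model(user_id):
    # Stand-in for a model whose upstream feature service is down.
    raise TimeoutError("upstream feature service unavailable")


def popular_interests(user_id):
    # Stand-in for a cheap, always-available baseline such as popularity.
    return ["Food and drink", "Women's fashion", "Geek", "Canoeing"]


result = recommend_with_fallback("user_1", flaky_model, popular_interests)
print(result)
```

A real system would also log and count each fallback, since silent degradation is one of the ways ML systems hide bugs.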
  • 35. Approach: One vs. Rest Geek W’s Fashion Food and Drink Canoeing … Interest Classifier Geek Women’s fashion Canoeing Food and drink Canoeing Classifier
  • 36. Approach: One vs. Rest Geek W’s Fashion Food and Drink Canoeing … Interest Classifier Geek Women’s fashion Canoeing Food and drink Geek Classifier
  • 37. Production Data Distribution does not Match Geek Women’s fashion Canoeing Food and drink Geek Classifier Unlabeled pins
  • 38. Production Data Distribution does not Match Geek Women’s fashion Canoeing Food and drink Geek Classifier Carefully think about biases in the data Unlabeled pins
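The one-vs-rest setup on slides 35 to 38 can be sketched as follows: each interest gets its own binary training set in which pins labeled with that interest are positives and pins labeled with any other interest are negatives. The pin identifiers and the helper `one_vs_rest_labels` are made up for illustration.

```python
def one_vs_rest_labels(pin_interests, target):
    """Binary labels for one interest: 1 if the pin carries `target`, else 0."""
    return {pin: int(target in interests)
            for pin, interests in pin_interests.items()}


# Hypothetical labeled pins; a pin may carry more than one interest.
labeled_pins = {
    "pin_a": {"Geek"},
    "pin_b": {"Canoeing"},
    "pin_c": {"Food and drink", "Geek"},
    "pin_d": {"Women's fashion"},
}

interests = sorted({i for labels in labeled_pins.values() for i in labels})

# One binary dataset (and therefore one classifier) per interest.
datasets = {interest: one_vs_rest_labels(labeled_pins, interest)
            for interest in interests}

print(datasets["Geek"])
```

Every negative in these datasets is a pin that carries some other interest. Unlabeled pins, which the classifier will meet in production, never appear in training, and that gap is the mismatch slides 37 and 38 warn about.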
  • 40. Hairstyles: Too much unlabeled
  • 42. More is better Learning 2 2 • More is better: - Models, Data, Features, Experiments • Hard to tell upfront what will work and what won’t - Try lots of things • Optimize for scale, flexibility, debuggability - Simple and consistent systems scale better
  • 43. For example: We started with 39 features… Quickly expanded to 670,000 features 7x gain in performance (F1 score)! No manual feature selection. Let the model select features! More is better Classifying pins to interests Women’s Fashion Food & Drink Geek
  • 44. Canoeing: 30k pins, 39 features
  • 45. Canoeing: 1m pins, 670k features
  • 46. While scaling up be very careful • Robust systems work in the presence of errors - Incorrectly implemented features - Features go missing - Models translated incorrectly - Data missing for a subset of users • Treating ML systems as black boxes, looking only at their output is dangerous - Especially when you are not sure what to look for - And because errors manifest as slightly lower accuracy - And because you don't know what accuracy to expect But: ML Systems Hide Bugs
  • 47. Evaluation is Hard Learning 3 3 Evaluation is always hard Not obvious whether an offline metric will correlate with an online metric Offline metric is a complex function of dataset creation, ground-truth labels, and the ML algorithm
  • 48. Training objectives / Offline metrics / Online metrics can be very different - Some correlation is expected, but once your models are sufficiently optimized, they begin to diverge - Online metrics are the only ones that matter, but are very expensive - Offline metric should predict online metric Naive split of training / testing is suboptimal - There is often a lot of subjectivity that goes into training data selection - It is more important that the evaluation data reflect reality than that they reflect the training data Evaluation is Hard
  • 49. • User features: • Landing page, demographics, Facebook • Interest features: • Topics, annotations, etc. • Model: User-cross-Interests • Feature hashing • What are the labels? • Not what user follows but interests of pins user is going to interact with in the future • Negative labels: Seen but not interacted • Scoring: Score 1k location-gender specific interests in real-time New User Interest Recommendations User follows interests User interacts with pins Idea: Recommend interests that user is going to interact with in the future
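Slide 49 names a user-cross-interest model with feature hashing but gives no details. The sketch below shows one common way to combine the two: every (user feature, interest feature) pair is hashed into a fixed number of buckets, and a linear model scores the resulting sparse vector. The bucket count, feature strings, and function names are assumptions for illustration, not the production design.

```python
import hashlib

NUM_BUCKETS = 2 ** 18  # assumed size of the hashed feature space


def bucket(feature, num_buckets=NUM_BUCKETS):
    """Map a feature string to a stable bucket index.

    hashlib is used instead of the built-in hash(), which is salted per
    process and would give different indices between training and serving.
    """
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def cross_features(user_feats, interest_feats):
    """Sparse vector {bucket: count} over all user x interest feature pairs."""
    vec = {}
    for u in user_feats:
        for i in interest_feats:
            idx = bucket(f"{u}|x|{i}")
            vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec


def score(weights, vec):
    """Linear score; buckets the model has never seen contribute 0."""
    return sum(weights.get(idx, 0.0) * value for idx, value in vec.items())


# Hypothetical features for one new user and one candidate interest.
user_feats = ["gender=f", "country=FR", "landing=recipes"]
interest_feats = ["topic=food_and_drink", "annotation=baking"]

vec = cross_features(user_feats, interest_feats)
weights = {idx: 0.5 for idx in vec}  # placeholder weights, not learned
print(len(vec), score(weights, vec))
```

Hashing keeps the model size fixed regardless of how many distinct feature pairs appear, at the cost of occasional collisions between unrelated pairs.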
  • 50. New User Interest Recommendations. Evaluation: • Number of followed interests (bad) • Number of pins interacted with (good) • AUC and precision at top 10 • Baselines: random, popularity. In two months we: • Ran 1,000s of offline experiments • Trained 1,000s of models to find a useful one • Generated 2,338 graphs and 148k pin galleries.
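The two offline metrics named above, AUC and precision at top k, can be computed as in this minimal sketch. The function names and the toy scores are assumptions for illustration; the AUC here uses the plain pairwise definition (the fraction of positive/negative pairs ranked correctly, with ties counted as half), which is quadratic and only suitable for small samples.

```python
from typing import List


def precision_at_k(scores: List[float], labels: List[int], k: int = 10) -> float:
    """Fraction of positives among the k highest-scored items."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)[:k]
    if not ranked:
        return 0.0
    return sum(label for _, label in ranked) / len(ranked)


def auc(scores: List[float], labels: List[int]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: label 1 means the user interacted with pins from the interest.
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
p_at_2 = precision_at_k(scores, labels, k=2)
area = auc(scores, labels)
```

Running the same functions on a random or popularity-ordered scoring gives the baseline numbers against which a model's values can be compared.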
  • 51. Evaluation is Hard Define clear offline success metrics • Consider many metrics Build meaningful baselines Clear offline metrics allow you to quickly compare solutions and prune bad directions
  • 52. Old Models Never Die. Models can live for a long time: long-term holdouts (> 1 year); not all effects can be observed in a short timeframe. Models should be independent of infrastructure and environment: infrastructure lifetime and model lifetime should be independent; it should be possible to deploy models in different environments. It is harder to track progress over time: changes are not additive; the only way to determine progress is to compare with older models.
  • 53. Possible Solutions Performance Explore and Learn Systematically explore Learn from your failures
  • 54. What should we do? What are some best practices?
  • 55. Automation Pays for Itself. • Having a repeatable, push-button, stable process is enormously valuable. • Automation encourages experimentation: try variations easily; reduces the temptation to bundle changes; easy baseline and a good starting point. • Regular retraining is enormously valuable. • A new team member should be able to go through a documented process and end up with a model that is on par with production.
  • 56. Models Need to be Managed. • We have hundreds of models in production: trained by different engineers; optimizing for different criteria; using different features; meant for different purposes; but running on the same infrastructure. • You need a process for: model storage and search; model deployment, documentation, and review; keeping model coupling/dependencies in check; tracking experiments and communicating successes and failures.
  • 57. Avoid Implicit Assumptions. • Make everything explicit (via a DSL): a (linear) model is not just an array of coefficients; it should list the source/raw features; it should contain the feature transforms; it should contain the score transform/calibration/link function; it should document how it was built, who built it, and when it was built, and point to instructions to reproduce it. • Config is better than code: create a well-documented model specification language that is human-readable but manipulable by tools (introspection, refactoring, etc.). • Minimize dependencies on the environment.
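A model specification that makes these pieces explicit might look like the sketch below. It is not the DSL referred to in the slides: the `ModelSpec` fields, the transform and link registries, and all example values are hypothetical, chosen only to show a linear model carried as declarative, serializable config.

```python
import json
import math
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class ModelSpec:
    """Hypothetical explicit spec for a linear model (all fields are assumptions)."""
    name: str
    author: str
    built_on: str
    repro_instructions: str
    raw_features: List[str]
    transforms: Dict[str, str]      # raw feature -> transform name
    coefficients: Dict[str, float]  # raw feature -> weight
    bias: float = 0.0
    link: str = "logistic"


# Transforms and link functions are referenced by name, so the spec stays pure data.
TRANSFORMS = {"identity": lambda x: x, "log1p": math.log1p}
LINKS = {"identity": lambda z: z, "logistic": lambda z: 1.0 / (1.0 + math.exp(-z))}


def score(spec: ModelSpec, raw: Dict[str, float]) -> float:
    """Apply feature transforms, the linear model, then the link function."""
    z = spec.bias
    for feat in spec.raw_features:
        transform = TRANSFORMS[spec.transforms.get(feat, "identity")]
        z += spec.coefficients.get(feat, 0.0) * transform(raw.get(feat, 0.0))
    return LINKS[spec.link](z)


# Hypothetical example values.
spec = ModelSpec(
    name="example_model",
    author="someone@example.com",
    built_on="2015-01-01",
    repro_instructions="docs/example_model.md",
    raw_features=["clicks"],
    transforms={"clicks": "log1p"},
    coefficients={"clicks": 1.0},
)
serialized = json.dumps(asdict(spec), indent=2)   # human-readable config
restored = ModelSpec(**json.loads(serialized))    # tools can load and inspect it
```

Because the spec is plain data, the same file can be read by people, diffed in review, and loaded by tooling without importing any training code.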
  • 58. There is more to ML Systems than ML. • Infrastructure is critical. • Building high-quality systems requires experts from different domains: how do ML engineers build models without a deep understanding of the infrastructure? How do infrastructure experts build, scale, and evolve the system? • Decoupling infrastructure and modeling is hard but worth it: it allows people with different backgrounds to work together; it requires well-thought-out interfaces, which are rarely achieved through organic evolution.
  • 59. Everything Gets Amplified at Scale. • 100M+ users: a vast, diverse, and changing user base makes user modeling a challenge; the product has to work well for niche as well as mainstream populations; optimizing for the majority can hurt subgroups; monitoring needs to be intelligent. • Billions of pieces of content: modeling is crucial; need to trade off recency, diversity, relevance, and ecosystem effects.
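The point that optimizing for the majority can hurt subgroups suggests breaking a metric down by segment instead of tracking a single aggregate. The following is a minimal sketch under assumed names; the record format (group, predicted, actual), the group labels, and the 0.1 margin are illustrative choices, not something the slides specify.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def accuracy_by_group(records: Iterable[Tuple[str, int, int]]) -> Dict[str, float]:
    """Accuracy per group from (group, predicted, actual) records."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for group, predicted, actual in records:
        total[group] += 1
        correct[group] += int(predicted == actual)
    return {group: correct[group] / total[group] for group in total}


def lagging_groups(
    by_group: Dict[str, float], overall: float, margin: float = 0.1
) -> List[str]:
    """Groups whose accuracy trails the overall accuracy by more than margin."""
    return sorted(g for g, acc in by_group.items() if acc < overall - margin)


# Hypothetical data: a large mainstream segment and a small niche one.
records = (
    [("mainstream", 1, 1)] * 9
    + [("mainstream", 0, 1)]
    + [("niche", 1, 0), ("niche", 1, 1)]
)
by_group = accuracy_by_group(records)
overall = sum(int(p == a) for _, p, a in records) / len(records)
flagged = lagging_groups(by_group, overall)
```

In this toy data the aggregate accuracy is dominated by the mainstream segment, so the weak niche segment is only visible once the metric is split by group.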
  • 60. jure@pinterest.com Come work with us! Thanks to Mukund, Dmitry, David, Pong, Dave, and Leon