Amazon Product Reviews Dataset

Dataset Overview
Data Type: Real
Default Task: Classification, Regression
Attribute Type: Multivariate, Sequential, Temporal aspect
Published Year: 2018
Area of Dataset: Includes ratings, review text, and helpfulness votes
Missing Values: Yes
No. of Instances: 233 million
No. of Attributes: 16+

Dataset Description:

This dataset is an updated and expanded version of the Amazon review dataset originally released in 2014. It contains a comprehensive collection of customer reviews, product metadata, and relational links useful for recommendation systems and data analysis.

Key Features:
  • Large Scale: Over 233 million reviews spanning from May 1996 to October 2018 (compared to 142.8 million in the 2014 release).
  • Review Data: Includes ratings, review text, and helpfulness votes.
  • Rich Metadata: Enhanced product information such as color, size, package type, bullet-point descriptions, technical details (attribute-value pairs), and images taken post-purchase.
  • Relational Links: Graph data showing “also viewed” and “also bought” product relationships.
  • Expanded Categories: Five new product categories added to cover a wider range of items.
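
As a rough illustration of how the "also bought" relational links could be turned into a graph, the sketch below builds an adjacency map from product metadata records. The field names `asin` and `also_buy` are assumptions about the metadata schema, not something stated on this page, and the function name and sample records are hypothetical. Treating each link as undirected is a design choice made for the example.

```python
from collections import defaultdict


def build_co_purchase_graph(products):
    """Build an undirected adjacency map from "also bought" links.

    Each product is a dict. The keys "asin" (product ID) and "also_buy"
    (list of related product IDs) are assumed names; adjust them to match
    the actual metadata files.
    """
    graph = defaultdict(set)
    for product in products:
        source = product["asin"]
        # Products without the assumed "also_buy" key contribute no edges.
        for target in product.get("also_buy", []):
            if target != source:  # skip self-links
                graph[source].add(target)
                graph[target].add(source)
    return dict(graph)


# Hypothetical records, made up for illustration only.
sample_products = [
    {"asin": "A1", "also_buy": ["A2", "A3"]},
    {"asin": "A2", "also_buy": ["A1"]},
    {"asin": "A3"},
]
graph = build_co_purchase_graph(sample_products)
```

An adjacency map of sets keeps duplicate links from inflating the graph and makes neighbor lookups cheap, which is usually enough for a first pass at item-to-item recommendation experiments.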

Applications:

Ideal for building and evaluating machine learning models in sentiment analysis, recommendation systems, opinion mining, and other NLP and data mining tasks.

Formats Available:

Typically available in JSON or CSV formats.
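
A minimal sketch of reading the JSON form, assuming the files hold one JSON object per line. The field names `asin`, `overall`, `reviewText`, and `vote` are assumptions about the review schema rather than details given on this page, as is the idea that vote counts may arrive as strings with thousands separators. The function name and sample lines are hypothetical.

```python
import json
from typing import Iterable, Iterator


def parse_reviews(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one normalized review dict per non-empty JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumed field names; missing fields fall back to defaults
        # because the dataset is listed as having missing values.
        rating = record.get("overall")
        vote_raw = record.get("vote")
        yield {
            "asin": record.get("asin"),
            "rating": float(rating) if rating is not None else None,
            "text": record.get("reviewText", ""),
            # Assumption: votes may look like "1,204", so strip commas.
            "helpful_votes": (
                int(str(vote_raw).replace(",", "")) if vote_raw is not None else 0
            ),
        }


# Hypothetical sample lines, made up for illustration only.
sample = [
    '{"asin": "B000TEST01", "overall": 5.0, "reviewText": "Works well.", "vote": "1,204"}',
    '{"asin": "B000TEST02", "overall": 2.0}',
]
reviews = list(parse_reviews(sample))
```

Because the function accepts any iterable of strings, a compressed file can be streamed without loading it fully into memory, for example by passing the handle returned by `gzip.open(path, "rt", encoding="utf-8")`.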

Source: 
https://snap.stanford.edu/data/web-Amazon-links.html
