Eat Real Food NYC
METHODOLOGY

Our Data & Methodology

Every restaurant in our directory is backed by verifiable public data. Here is exactly where our data comes from, how we process it, and how we calculate our Health Score.

DATA SOURCES

Where our data comes from

NYC Department of Health and Mental Hygiene (DOHMH)

What data we pull

Restaurant inspection grades, numeric inspection scores, inspection dates, violation histories, and business operational status.

How we process it

We pull the full DOHMH inspection dataset from NYC Open Data and match each record to our restaurant listings by name and address. We retain the most recent inspection grade, the cumulative inspection score, and the inspection date. Restaurants without a current grade are flagged but still included in the directory.
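As a rough illustration of the matching step described above, the sketch below normalizes names and addresses before comparing them and keeps only the most recent inspection per restaurant. The field names (`name`, `address`, `grade`, `score`, `inspection_date`), the normalization rules, and the sample records are assumptions made for this example, not the actual pipeline or the DOHMH column names.

```python
import re
from datetime import date


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return " ".join(text.split())


def latest_inspections(records):
    """Map each (name, address) key to its most recent inspection record."""
    latest = {}
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["address"]))
        current = latest.get(key)
        if current is None or rec["inspection_date"] > current["inspection_date"]:
            latest[key] = rec
    return latest


def match_listing(listing, latest):
    """Return the latest inspection for a listing, or None if nothing matches."""
    key = (normalize(listing["name"]), normalize(listing["address"]))
    return latest.get(key)


# Hypothetical sample data: two inspections of the same restaurant.
records = [
    {"name": "Joe's Cafe", "address": "12 Main St.", "grade": "B",
     "score": 20, "inspection_date": date(2023, 5, 1)},
    {"name": "JOE'S CAFE", "address": "12 Main St", "grade": "A",
     "score": 9, "inspection_date": date(2024, 6, 15)},
]
latest = latest_inspections(records)
match = match_listing({"name": "Joe's Cafe", "address": "12 main st"}, latest)
```

Exact matching on normalized strings is the simplest option. A production pipeline would likely need fuzzy matching or a shared identifier to handle abbreviations and renamed businesses.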

Coverage

100% of restaurants in our directory have been cross-referenced against the DOHMH database. Approximately 92% have a current letter grade (A, B, or C). The remaining 8% are pending re-inspection or are newly opened.

Google Maps Platform

What data we pull

Business names, addresses, latitude/longitude coordinates, user ratings, review counts, phone numbers, websites, working hours, cuisine types, and business photos.

How we process it

Restaurant listings are sourced via the Google Maps Places API. We geocode every address to obtain precise coordinates for map placement. Ratings and review counts are captured at the time of data collection and represent a snapshot. We do not fabricate or modify user ratings.

Coverage

100% of our 8,835 restaurants have verified Google Maps listings. Over 99% have latitude/longitude coordinates. Approximately 85% have phone numbers and 70% have listed websites.

NYC Neighborhood Tabulation Areas (NTA) GeoJSON

What data we pull

Neighborhood boundaries, neighborhood names, and borough assignments for all NYC neighborhoods.

How we process it

We use the official NYC NTA boundary files from the Department of City Planning to assign every restaurant to its correct neighborhood using point-in-polygon geospatial matching. This ensures that neighborhood pages reflect the actual city-defined boundaries, not approximations.
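Point-in-polygon matching can be illustrated with the classic ray-casting test. The sketch below assumes each neighborhood is a single simple ring of (longitude, latitude) pairs; the function names and the two square "neighborhoods" are hypothetical. Real NTA boundaries are multi-part polygons, so a geometry library would normally handle this step.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: True if (lon, lat) lies inside the closed ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_neighborhood(lon, lat, neighborhoods):
    """Return the first neighborhood whose ring contains the point, else None."""
    for name, ring in neighborhoods.items():
        if point_in_polygon(lon, lat, ring):
            return name
    return None


# Hypothetical boundaries: two adjacent unit squares.
neighborhoods = {
    "West Square": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "East Square": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
```

With these sample shapes, `assign_neighborhood(0.5, 0.5, neighborhoods)` returns `"West Square"`, and a point outside both squares returns `None`.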

Coverage

100% of geocoded restaurants are assigned to an NTA-defined neighborhood. We cover all 5 boroughs and over 190 distinct neighborhoods.

AI-Assisted Dietary Tagging (Claude API)

What data we pull

12 dietary classification tags: vegan, vegetarian, gluten-free, keto, paleo, halal, kosher, dairy-free, nut-free, raw-food, whole-foods, and low-calorie.

How we process it

We use Claude (claude-haiku-4-5 for bulk classification, claude-sonnet-4-20250514 for quality review) to analyze each restaurant's cuisine type, name, menu indicators, and publicly available information to assign dietary tags. Tags are applied conservatively: a restaurant is only tagged if it genuinely specializes in or is certified for that dietary category. Halal and kosher tags require evidence of certification, not merely menu items that happen to comply. Every AI-generated tag is subject to manual spot-check review.

Coverage

100% of restaurants have been processed through the dietary tagging pipeline. Approximately 65% of restaurants have at least one dietary tag. The most common tags are vegetarian (38%), halal (12%), and gluten-free (11%).

SCORING SYSTEM

How we calculate the Health Score

Every restaurant in our directory receives a Health Score from 0 to 100. This score is not a subjective editorial rating; it is a weighted composite derived from verifiable data points. Here is the exact formula:

Inspection Grade (40 points)

Grade A = 40 pts, Grade B = 25 pts, Grade C = 10 pts, No grade = 0 pts. This is the single most important factor because it reflects the official NYC Health Department assessment of food safety and sanitation.

Dietary Diversity (20 points)

Up to 20 points based on the number and relevance of dietary tags. Restaurants with verified certifications (halal, kosher) or dedicated dietary options (fully vegan, dedicated gluten-free kitchen) score higher than those with incidental compliance.

Hidden Gem Bonus (10 points)

10 points awarded to restaurants that qualify as hidden gems. This rewards high-quality restaurants that may not have mass-market visibility but deliver exceptional experiences.

User Rating (10 points)

Scaled from the Google Maps rating. A 5.0 rating = 10 pts, 4.0 = 6 pts, 3.0 = 2 pts, below 3.0 = 0 pts. Ratings are interpolated linearly within each bracket.
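The bracket interpolation can be sketched as follows. The anchor values come from the text above; the function name `rating_points` and the clamping of ratings above 5.0 are assumptions for illustration.

```python
def rating_points(rating: float) -> float:
    """Convert a Google Maps rating into User Rating points (0 to 10)."""
    # Anchor points taken from the stated scale: (rating, points).
    anchors = [(3.0, 2.0), (4.0, 6.0), (5.0, 10.0)]
    if rating < 3.0:
        return 0.0
    rating = min(rating, 5.0)  # assumption: treat anything above 5.0 as 5.0
    for (r_lo, p_lo), (r_hi, p_hi) in zip(anchors, anchors[1:]):
        if rating <= r_hi:
            # Linear interpolation between the two anchors of this bracket.
            return p_lo + (p_hi - p_lo) * (rating - r_lo) / (r_hi - r_lo)
    return anchors[-1][1]
```

Under this reading, a 4.5 rating falls halfway between the 4.0 and 5.0 anchors and earns 8 points, while a 3.5 rating earns 4 points.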

Track Record (10 points)

Up to 10 points based on the restaurant's inspection history over time. Restaurants that have maintained a Grade A across multiple inspection cycles receive the full 10 points. A single grade drop reduces this proportionally.

Maximum from the components listed above: 40 + 20 + 10 + 10 + 10 = 90 points. The remaining 10 points of the 0-to-100 scale are not itemized here.
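A minimal sketch of how the listed components might be combined is shown below. The grade-to-points mapping is taken from the text. Because the exact formulas for Dietary Diversity and Track Record are not spelled out, those sub-scores (and the User Rating points) are passed in as precomputed values. The function and parameter names are hypothetical.

```python
GRADE_POINTS = {"A": 40, "B": 25, "C": 10}  # any other value counts as no grade


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def health_score(grade, dietary_points, is_hidden_gem,
                 rating_pts, track_record_points):
    """Sum the listed components, capping each at its stated maximum."""
    total = GRADE_POINTS.get(grade, 0)           # Inspection Grade
    total += clamp(dietary_points, 0, 20)        # Dietary Diversity
    total += 10 if is_hidden_gem else 0          # Hidden Gem Bonus
    total += clamp(rating_pts, 0, 10)            # User Rating
    total += clamp(track_record_points, 0, 10)   # Track Record
    return total


# Hypothetical restaurant: Grade B, 12 dietary points, not a hidden gem,
# 8 rating points, 5 track-record points.
example = health_score("B", 12, False, 8, 5)
```

For this hypothetical restaurant the components add up to 25 + 12 + 0 + 8 + 5 = 50.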

HIDDEN GEMS

How we identify hidden gems

Hidden gems are restaurants that deliver outstanding quality but have not yet achieved widespread recognition. We identify them algorithmically using three strict criteria, all of which must be met:

✓ Rating of 4.5 or higher

The restaurant must have a Google Maps rating of at least 4.5 out of 5.0, indicating consistently excellent customer experiences.

✓ Fewer than 200 reviews

The restaurant must have fewer than 200 Google Maps reviews, indicating it has not yet reached mass-market awareness.

✓ Currently operational

The restaurant must be confirmed as currently operating. Closed or temporarily shuttered restaurants are excluded.
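The three criteria above translate into a simple predicate. The function name and the handling of missing ratings are assumptions for this sketch.

```python
def is_hidden_gem(rating, review_count, is_operational):
    """Return True only when all three hidden-gem criteria are met."""
    if rating is None or review_count is None:
        return False  # assumption: missing data never qualifies
    return rating >= 4.5 and review_count < 200 and bool(is_operational)
```

Note the boundaries: a 4.5 rating qualifies, but exactly 200 reviews does not, since the criterion is "fewer than 200".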

DATA FRESHNESS

How often we update our data

Weekly

NYC DOHMH inspection grades and scores are refreshed weekly from the NYC Open Data portal to reflect the latest inspection results.

Monthly

Google Maps ratings, review counts, and business status (open/closed) are updated monthly to keep our listings current.

Quarterly

Dietary tags are re-evaluated quarterly. New restaurants added to the DOHMH database are processed through our full pipeline within 30 days of appearing in the city data.

Ongoing

User-submitted corrections are reviewed within 48 hours and applied to the database if verified.
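The refresh cadence above could be tracked with a small staleness check like the one below. The source keys, the 30-day and 90-day approximations of "monthly" and "quarterly", and the sample dates are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Refresh intervals per source; month and quarter lengths are approximations.
REFRESH_INTERVALS = {
    "dohmh_inspections": timedelta(days=7),   # weekly
    "google_maps": timedelta(days=30),        # monthly
    "dietary_tags": timedelta(days=90),       # quarterly
}


def stale_sources(last_refreshed, today):
    """Return the sorted names of sources whose last refresh is overdue."""
    return sorted(
        name
        for name, interval in REFRESH_INTERVALS.items()
        if today - last_refreshed[name] > interval
    )


# Hypothetical refresh log.
last_refreshed = {
    "dohmh_inspections": date(2024, 6, 20),
    "google_maps": date(2024, 6, 15),
    "dietary_tags": date(2024, 3, 1),
}
overdue = stale_sources(last_refreshed, today=date(2024, 7, 1))
```

In this example the inspection data (11 days old) and the dietary tags (122 days old) are overdue, while the Google Maps snapshot (16 days old) is still within its monthly window.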

Found an error in our data?

Our data is only as good as the sources we pull from and the processes we apply. If you spot an inaccuracy (a wrong grade, a missing dietary tag, a closed restaurant still listed), we want to know.

Report a data issue →