Updates - October 3, 2024

Updates
New team members
Project management
Project timeline
Data gathering
Data cleaning and prep
Initial NLP efforts
Image classification

Data Gathering

[Figure: data gathering stats]

Data gathering

Finishing up data gathering from elementvape
Data and code should be available in the shared folder
Identified 30 other potential sites to use if needed (there are definitely more out there)

Data cleaning

Sticking with the structure from last time
Shared sample mipod data
[Figure: data model]

NLP Updates

Regular expressions are performing well on numeric values
e.g., puffs per device, e-liquid contents, price
Working on TFN (tobacco-free nicotine)/synthetic, nicotine-free, and CBD/THC
Less success when testing with new data
Nicotine salts/freebase not being automatically identified as TFN
Multiple nicotine values not being picked up
Currently testing fine-tuned LLMs
Screens will be the next pass
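
A minimal sketch of the regular-expression approach to numeric values, assuming product text in which puff counts, e-liquid volume, price, and nicotine strength appear in common formats. The patterns, field names, and sample text are hypothetical illustrations, not the project's actual code. Using `findall` for nicotine is one way to keep every strength listed instead of only the first.

```python
import re

# Hypothetical patterns: the real product text may need other formats.
PUFFS_RE = re.compile(r"(\d[\d,]*)\s*\+?\s*puffs?", re.IGNORECASE)
VOLUME_RE = re.compile(r"(\d+(?:\.\d+)?)\s*ml\b", re.IGNORECASE)
PRICE_RE = re.compile(r"\$\s*(\d+(?:\.\d{1,2})?)")
NICOTINE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mg(?:/ml)?|%)", re.IGNORECASE)


def extract_numeric_fields(text):
    """Return the numeric fields found in one product description."""
    puffs = PUFFS_RE.search(text)
    volume = VOLUME_RE.search(text)
    price = PRICE_RE.search(text)
    return {
        # Strip thousands separators before converting, e.g. "5,000" -> 5000.
        "puffs": int(puffs.group(1).replace(",", "")) if puffs else None,
        "eliquid_ml": float(volume.group(1)) if volume else None,
        "price_usd": float(price.group(1)) if price else None,
        # findall returns every (value, unit) pair, so products that list
        # several nicotine strengths keep all of them.
        "nicotine": [(float(v), u.lower()) for v, u in NICOTINE_RE.findall(text)],
    }


# Hypothetical sample text, not taken from the gathered data.
sample = "Example Bar 5,000 Puffs - 12mL prefilled, 20mg / 50mg nicotine salt, $14.99"
fields = extract_numeric_fields(sample)
```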

Image Processing Updates

[Figure: vape images]

Image Processing

Have an initial pass at “iced” and “screen”
Going to test with additional data
Seeing potential to distinguish between screen models
Checking whether performance and speed can be improved
Using a pre-trained model to find images of vapes, filtering non-vapes, parts, etc. out of the data set
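
A rough sketch of the filtering step, assuming the pre-trained model can be wrapped as a function that returns a "this is a vape" score between 0 and 1 for an image path. The function names, the 0.5 threshold, and the stand-in scores are hypothetical; no particular model or library is implied.

```python
from typing import Callable, Iterable, List, Tuple


def filter_vape_images(
    image_paths: Iterable[str],
    vape_score: Callable[[str], float],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split image paths into (kept, dropped) using a vape-likelihood score.

    vape_score is a hypothetical wrapper around whichever pre-trained
    model is used; it should return a value in [0, 1].
    """
    kept, dropped = [], []
    for path in image_paths:
        if vape_score(path) >= threshold:
            kept.append(path)
        else:
            dropped.append(path)
    return kept, dropped


# Stand-in scores for illustration only; a real run would call the model.
fake_scores = {"device.jpg": 0.92, "coil_part.jpg": 0.31, "banner.jpg": 0.05}
kept, dropped = filter_vape_images(fake_scores, fake_scores.get)
```

Returning the dropped paths as well makes it easy to spot-check what the filter removed before the smaller data set goes on to the “iced” and “screen” classifiers.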

Iced

[Figure: iced vapes]

Screens

[Figure: vapes with screens]