Updates - October 3, 2024¶
Updates
New team members
Project management
Project timeline
Data gathering
Data cleaning and prep
Initial NLP efforts
Image classification
Data gathering¶
Finishing up elementvape
Data and code should be available in the shared folder
Identified 30 other potential sites if needed (definitely more out there)
Data cleaning¶
Sticking with the structure from last time
Shared sample mipod data
NLP Updates¶
Numeric values performing well with regular expressions
e.g. puffs per device, e-liquid contents, price
Working on TFN (tobacco-free nicotine)/synthetic, nicotine-free, CBD/THC
Less success when testing with new data
Nicotine salts/freebase not being auto-identified as TFN
Multiple nicotine values not being picked up
Currently testing LLMs with fine-tuning
Screens will be the next pass
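As a rough illustration of the regular-expression approach described above, here is a minimal Python sketch. The patterns, the `extract_fields` name, and the sample text are all assumptions made for illustration; they are not the expressions the team is actually using. It uses `findall` for nicotine strengths, which is one way to pick up listings that carry more than one nicotine value.

```python
import re

# Hypothetical patterns for illustration only; the team's actual
# expressions are not shown in these notes.
PUFFS = re.compile(r"(\d[\d,]*)\s*\+?\s*puffs?\b", re.IGNORECASE)
VOLUME_ML = re.compile(r"(\d+(?:\.\d+)?)\s*ml\b", re.IGNORECASE)
PRICE = re.compile(r"\$\s*(\d+(?:\.\d{2})?)")
NICOTINE = re.compile(r"(\d+(?:\.\d+)?)\s*(mg(?:/ml)?|%)", re.IGNORECASE)


def extract_fields(text):
    """Pull numeric product fields out of a free-text description."""
    puffs = PUFFS.search(text)
    volume = VOLUME_ML.search(text)
    price = PRICE.search(text)
    return {
        "puffs": int(puffs.group(1).replace(",", "")) if puffs else None,
        "volume_ml": float(volume.group(1)) if volume else None,
        "price_usd": float(price.group(1)) if price else None,
        # findall keeps every strength listed, not just the first match
        "nicotine": [(float(v), u.lower()) for v, u in NICOTINE.findall(text)],
    }


# Made-up description, not taken from the scraped data
sample = "Example disposable: 5,000 puffs, 12 mL e-liquid, 20mg or 50mg nicotine, $14.99"
fields = extract_fields(sample)
```

On the made-up sample, `fields["nicotine"]` holds both strengths (20 mg and 50 mg) rather than only the first one. Real listings will need broader patterns, for example for prices without a dollar sign or strengths written as ranges.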
Image Processing Updates¶
Have an initial pass at “iced” and “screen”
Going to test with additional data
May be possible to distinguish the screen model
Checking whether performance and speed can be improved
Using a pre-trained model to find images of vapes and filter non-vapes, parts, etc. out of the data set
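The filtering step could look roughly like the sketch below. It assumes the pre-trained model is wrapped in a function that returns a `(label, score)` pair per image. The `classify` callable, the `VAPE_LABELS` set, and the `min_score` threshold are all hypothetical names chosen for illustration, since the notes do not say which model or labels are in use.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical label set; the real labels depend on the model chosen.
VAPE_LABELS = {"vape"}


def filter_vape_images(
    paths: Iterable[str],
    classify: Callable[[str], Tuple[str, float]],
    min_score: float = 0.5,
) -> List[str]:
    """Keep paths whose predicted label is a vape label with enough confidence."""
    kept = []
    for path in paths:
        label, score = classify(path)  # classify wraps the pre-trained model
        if label in VAPE_LABELS and score >= min_score:
            kept.append(path)
    return kept
```

Passing the classifier in as an argument keeps the filter independent of any particular model, so it can be tried with a stub before the real model is wired in.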
Iced¶
Screens¶