Using tiny...not able to extract correctly...

#19
by padysrini - opened

I am trying to parse car price, miles, and location from free text. Sometimes it works, but in many cases it fails.

Input: 10k price 90k miles around houston
Output: {'miles': 2000, 'price': 1000, 'location': 'detroit'}

Input: 20k miles 10k price in detroit
Output: {'miles': 2000, 'price': 1000, 'location': 'detroit'}

Relevant parts of the code:

prompt_template = """Extract miles, price, and location as JSON from the given text.
  • Normalize numbers by removing symbols like '$' and ','.
  • Interpret proximity words like "area", "near", or "around" as setting corresponding '...Around' flags to true.
  • If miles, price, or location are missing, return their value as null.
  • For miles and price, if a range is given, return "minMiles" and "maxMiles" or "minPrice" and "maxPrice" as integers.
  • For qualifiers like "up to", "below", "max", set "milesMax" or "priceMax" accordingly.
  • Normalize all numbers by removing currency symbols and commas.
  • Return the output strictly as a JSON object with keys: miles, milesAround, price, priceAround, location, locationAround, minMiles, maxMiles, minPrice, maxPrice, milesMax, priceMax.

Example 1

Input: 2k miles 1k price in detroit
Output: {{"miles": 2000, "price": 1000, "location": "detroit"}}

Example 2

Input: 5k price 15k miles boston
Output: {{"miles": 15000, "price": 5000, "location": "boston"}}

Input: "{0}"

Output:
"""

prompt = prompt_template.format(normalized_text)

result = nlp_pipe(prompt)[0]['generated_text']
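
Both failing outputs above contain values that never appear in the input (they match Example 1 from the prompt instead). A minimal sketch of a sanity check that flags this kind of result is shown below. It assumes the generated text has already been parsed into a Python dict; the helper names `numbers_in_text` and `is_grounded` are hypothetical and not part of the pipeline above. It only checks that each value occurs somewhere in the input, so it will not catch miles and price being swapped.

```python
import re

# Hypothetical helpers for illustration; not part of the original pipeline.
# Matches numbers such as "10k", "$1,500" or "90000", with an optional "k" suffix.
_NUMBER = re.compile(r"\$?(\d[\d,]*(?:\.\d+)?)\s*(k)?\b", re.IGNORECASE)


def numbers_in_text(text):
    """Return the set of integers mentioned in text, expanding a 'k' suffix."""
    found = set()
    for digits, suffix in _NUMBER.findall(text):
        value = float(digits.replace(",", ""))
        if suffix:
            value *= 1000
        found.add(int(value))
    return found


def is_grounded(text, output):
    """Return True if every non-null field of output can be found in text."""
    numbers = numbers_in_text(text)
    for key in ("miles", "price"):
        value = output.get(key)
        if value is not None and value not in numbers:
            return False
    location = output.get("location")
    if location is not None and location.lower() not in text.lower():
        return False
    return True


text = "10k price 90k miles around houston"
bad = {"miles": 2000, "price": 1000, "location": "detroit"}
print(is_grounded(text, bad))  # False: none of these values occur in the input
```

A result that fails this check could be retried or discarded rather than passed downstream.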
