Using tiny...not able to extract correctly...
#19, opened by padysrini
I am trying to parse car price, miles, and location from free text. Sometimes it works, but in many cases it fails.
Input: 10k price 90k miles around houston
Output: {'miles': 2000, 'price': 1000, 'location': 'detroit'}
Input: 20k miles 10k price in detroit
Output: {'miles': 2000, 'price': 1000, 'location': 'detroit'}
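For comparison, the values one would expect for the two inputs above can be computed with a small rule-based baseline. This is only a sketch and not part of the original code: `baseline_parse` is a hypothetical helper, and it assumes each number is directly followed by the word "price" or "miles" and that the location comes last, after "in", "around", or "near". It does not handle ranges, qualifiers, or a bare trailing city name.

```python
import re

# Hypothetical helper, not part of the original code: a rule-based baseline
# for inputs shaped like "<number>[k] price|miles ... in|around|near <place>".
_AMOUNT = re.compile(r"\$?(\d[\d,]*(?:\.\d+)?)\s*(k)?\s+(price|miles)", re.IGNORECASE)
_PLACE = re.compile(r"\b(?:in|around|near)\s+([a-z][a-z ]*)$", re.IGNORECASE)


def baseline_parse(text):
    """Return miles, price, and location, using None for anything not found."""
    result = {"miles": None, "price": None, "location": None}
    for number, kilo, field in _AMOUNT.findall(text):
        value = float(number.replace(",", ""))  # drop thousands separators
        if kilo:
            value *= 1000  # "10k" means 10000
        result[field.lower()] = int(value)
    place = _PLACE.search(text.strip())
    if place:
        result["location"] = place.group(1).strip().lower()
    return result


for sample in ("10k price 90k miles around houston",
               "20k miles 10k price in detroit"):
    print(sample, "->", baseline_parse(sample))
```

Both model outputs shown above are identical to the output of Example 1 in the prompt, so a baseline like this makes it easy to spot when the model is echoing an example instead of reading the input.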
Parts of the code:
prompt_template = """Extract miles, price, and location as JSON from the given text.
- Normalize numbers by removing symbols like '$' and ','.
- Interpret proximity words like "area", "near", or "around" as setting corresponding '...Around' flags to true.
- If miles, price, or location are missing, return their value as null.
- For miles and price, if a range is given, return "minMiles" and "maxMiles" or "minPrice" and "maxPrice" as integers.
- For qualifiers like "up to", "below", "max", set "milesMax" or "priceMax" accordingly.
- Normalize all numbers by removing currency symbols and commas.
- Return the output strictly as a JSON object with keys: miles, milesAround, price, priceAround, location, locationAround, minMiles, maxMiles, minPrice, maxPrice, milesMax, priceMax.
Example 1
Input: 2k miles 1k price in detroit
Output: {{"miles": 2000, "price": 1000, "location": "detroit"}}
Example 2
Input: 5k price 15k miles boston
Output: {{"miles": 15000, "price": 5000, "location": "boston"}}
Input: "{0}"
Output:
"""
# normalized_text and nlp_pipe are defined elsewhere (not shown here).
prompt = prompt_template.format(normalized_text)
result = nlp_pipe(prompt)[0]['generated_text']
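Depending on the pipeline type, `generated_text` may contain the prompt followed by the completion, in which case the JSON from Example 1 and Example 2 is also present in the string. The sketch below shows one way to isolate the model's own answer and pad it to the twelve keys the prompt asks for. `parse_generated_text` and `EXPECTED_KEYS` are hypothetical names introduced here, and the function assumes the model emits valid JSON with double quotes.

```python
import json

# Keys requested by the last rule of the prompt.
EXPECTED_KEYS = [
    "miles", "milesAround", "price", "priceAround",
    "location", "locationAround", "minMiles", "maxMiles",
    "minPrice", "maxPrice", "milesMax", "priceMax",
]


def parse_generated_text(generated_text, prompt):
    """Return the first JSON object after the prompt, or None if there is none.

    Keys the model left out are filled with None.
    """
    # Drop the echoed prompt so the few-shot example outputs are not parsed.
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]

    decoder = json.JSONDecoder()
    pos = generated_text.find("{")
    while pos != -1:
        try:
            obj, _ = decoder.raw_decode(generated_text, pos)
        except json.JSONDecodeError:
            # Not valid JSON at this brace; try the next one.
            pos = generated_text.find("{", pos + 1)
            continue
        if isinstance(obj, dict):
            return {key: obj.get(key) for key in EXPECTED_KEYS}
        pos = generated_text.find("{", pos + 1)
    return None
```

With the code above it would be called as `parse_generated_text(result, prompt)`. A return value of `None` means the completion contained no parseable JSON object, which is worth logging separately from a wrong but well-formed answer.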