Description
I'm getting an error when using Gemini batch and am struggling to debug it. It seems to be related to how curator serializes the Gemini batch request to GCS.
Failed to import data. Please check 'Prepare input' section of the batch predictions documentation. For Claude models, see https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/claude-batch#prepare_input Invalid value: Error while reading data, error message: Failed to parse JSON: No active field found.; ParsedString returned false; Could not parse value; Could not parse value; Could not parse value; Could not parse value; Could not parse value; Could not parse value; Parser terminated before end of string File: requests_0.jsonl at [1:1]
This is a representative curator pass:
from typing import List

from bespokelabs import curator


class HarmfulContentDetector(curator.LLM):
    def prompt(self, row) -> List[dict]:
        prompt = "Does this image contain harmful content? Reply only Yes or No."
        # Image reference in Gemini's fileData form, pointing at a gs:// URI.
        image_part = {
            "fileData": {
                "mimeType": "image/jpeg",
                "fileUri": row["image_uri"],
            }
        }
        return [{"role": "user", "content": [prompt, image_part]}]

    def parse(self, row, resp):
        return {
            "image_uri": row["image_uri"],
            # Fall back to "ERROR" when the response is empty or None.
            "boxed": (resp or "").strip() or "ERROR",
        }
I'm passing gs:// URIs directly as the image_uri values in each row.
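Since the error points at position [1:1] of requests_0.jsonl, one way to narrow this down might be to download that file from the bucket and check each line locally. Below is a minimal sketch, not part of curator: `check_jsonl_lines` is a hypothetical helper, and it assumes the Vertex AI batch input format is one JSON object per line with a top-level "request" key. The sample lines are made up for illustration and are not what curator actually wrote.

```python
import json
from typing import List, Tuple


def check_jsonl_lines(lines: List[str]) -> List[Tuple[int, str]]:
    """Return (line_number, problem) for every line that fails the checks.

    Hypothetical helper. Assumes each batch input line must be a JSON
    object with a top-level "request" key.
    """
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        if not raw.strip():
            problems.append((lineno, "empty line"))
            continue
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError as exc:
            # Report where the standard-library parser gave up.
            problems.append(
                (lineno, f"invalid JSON: {exc.msg} at column {exc.colno}")
            )
            continue
        if not isinstance(obj, dict) or "request" not in obj:
            problems.append((lineno, 'missing top-level "request" key'))
    return problems


# Made-up sample lines: one well-formed, one Python-repr style (single
# quotes, not valid JSON), one valid JSON without the "request" wrapper.
sample = [
    '{"request": {"contents": []}}',
    "{'request': {'contents': []}}",
    '{"contents": []}',
]
for lineno, problem in check_jsonl_lines(sample):
    print(f"line {lineno}: {problem}")
```

To run it against the real file, the lines could be read with `open("requests_0.jsonl").read().splitlines()` after copying the file down from GCS. If line 1 fails with an invalid-JSON problem, that would suggest the file is being written in a non-JSON form rather than the image part itself being rejected.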