Use the GLM-OCR model to parse the layout of documents and images and extract their text content. It supports OCR of images and PDF documents and returns detailed layout information and visualization results.
Model code: glm-ocr
"glm-ocr"
Image or PDF document to be recognized, supplied as a URL or a base64 string. Supported formats: PDF, JPG, PNG. A single image must be ≤10MB and a PDF ≤50MB, with a maximum of 100 pages.
"https://cdn.bigmodel.cn/static/logo/introduction.png"
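The size and format limits above can be checked locally before a file is sent as base64. The following is a minimal sketch, not part of the official SDK: the helper name `encode_file` is hypothetical, "MB" is assumed to mean 1024 * 1024 bytes, and only the three extensions listed above are accepted. Whether the API expects bare base64 text or some prefixed form is not stated here, so the sketch returns bare base64.

```python
import base64

# Limits taken from the parameter description; the byte value of "MB" is an assumption.
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # single image: <= 10MB
MAX_PDF_BYTES = 50 * 1024 * 1024    # PDF: <= 50MB
ALLOWED_EXTENSIONS = {"pdf", "jpg", "png"}


def encode_file(data: bytes, extension: str) -> str:
    """Validate raw file bytes against the documented limits and return base64 text.

    Hypothetical helper for illustration only.
    """
    ext = extension.lower().lstrip(".")
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported format: {extension!r}")
    limit = MAX_PDF_BYTES if ext == "pdf" else MAX_IMAGE_BYTES
    if len(data) > limit:
        raise ValueError(f"{ext} file is {len(data)} bytes, limit is {limit}")
    return base64.b64encode(data).decode("ascii")
```

The 100-page limit for PDFs is not checked here, because counting pages would require a PDF parsing library.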
Whether to return screenshot information
Whether to return detailed layout visualization images
Start page number for parsing when a PDF is provided
x >= 1
End page number for parsing when a PDF is provided
x >= 1
Unique request identifier, automatically generated if not provided
"req_123456789"
End user ID for abuse monitoring. Length: 6-128 characters
6 - 128
"user_123456"
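The request parameters above carry a few constraints (page numbers of at least 1, a user ID of 6 to 128 characters) that can be validated before a request is made. The sketch below assembles a request body as a plain dictionary. This section does not show the JSON field names, so every key used here (`model`, `file`, `start_page_id`, `end_page_id`, `request_id`, `user_id`) is an assumption, as is the function name `build_payload`; check them against the parameter table before use. The two boolean options are omitted for the same reason.

```python
from typing import Any, Dict, Optional


def build_payload(
    file: str,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    request_id: Optional[str] = None,
    user_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a request body; every key name below is an assumed placeholder."""
    # Documented constraint: page numbers must be >= 1.
    for name, page in (("start_page", start_page), ("end_page", end_page)):
        if page is not None and page < 1:
            raise ValueError(f"{name} must be >= 1, got {page}")
    # Not stated in the reference: a local sanity check on the page range.
    if start_page is not None and end_page is not None and end_page < start_page:
        raise ValueError("end_page must not be smaller than start_page")
    # Documented constraint: user ID length is 6-128 characters.
    if user_id is not None and not 6 <= len(user_id) <= 128:
        raise ValueError("user_id must be 6-128 characters long")

    payload: Dict[str, Any] = {"model": "glm-ocr", "file": file}
    optional = {
        "start_page_id": start_page,  # assumed key name
        "end_page_id": end_page,      # assumed key name
        "request_id": request_id,     # assumed key name
        "user_id": user_id,           # assumed key name
    }
    # Leave out anything the caller did not set so server defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

For example, `build_payload("https://cdn.bigmodel.cn/static/logo/introduction.png", user_id="user_123456")` yields a dictionary with only the model, file, and user ID entries, leaving the request identifier to be generated automatically as described above.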
Business processing successful
Task ID
"task_123456789"
Request creation time, Unix timestamp in seconds
1727156815
Model name
"GLM-OCR"
Recognition result in Markdown format
"# Doc title\nThis is the document content..."
Detailed layout information
Recognition result image URLs
Basic document information
Token usage statistics returned when the model call ends.
Request ID
"req_123456789"
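As an illustration of reading the response fields listed above, the following sketch converts the creation timestamp from Unix seconds to a UTC time and pulls the first line out of the Markdown result. The response key names (`id`, `created`, `md_results`, `request_id`) are assumptions, since this section lists only the field descriptions and example values; `summarize_response` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def summarize_response(body: Dict[str, Any]) -> Dict[str, Any]:
    """Extract a few fields from a parsed response body (key names are assumed)."""
    # The creation time is documented as a Unix timestamp in seconds.
    created = datetime.fromtimestamp(body["created"], tz=timezone.utc)
    markdown = body.get("md_results", "")
    first_line = markdown.splitlines()[0] if markdown else ""
    return {
        "task_id": body.get("id"),
        "created_utc": created.isoformat(),
        # Strip leading Markdown heading markers from the first line.
        "title": first_line.lstrip("# ").strip(),
        "request_id": body.get("request_id"),
    }


# Sample body built from the example values in this reference.
sample = {
    "id": "task_123456789",
    "created": 1727156815,
    "model": "GLM-OCR",
    "md_results": "# Doc title\nThis is the document content...",
    "request_id": "req_123456789",
}
summary = summarize_response(sample)
```

With the sample values, `summary["created_utc"]` is `2024-09-24T05:46:55+00:00` and `summary["title"]` is `Doc title`.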