[Issue]: Object of type ModelMetaclass is not JSON serializable #1884


Open
1 of 3 tasks
bodhiiiii opened this issue Apr 16, 2025 · 1 comment
Labels
triage Default label assignment, indicates new issue needs reviewed by a maintainer

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the issue

An error occurs during the index-building phase: "Object of type ModelMetaclass is not JSON serializable."
Global search still reports this error even when it manages to return an answer, while local search fails outright.
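For context, the final TypeError is easy to reproduce in isolation: json.dumps cannot serialize a class object, and a Pydantic model class has type ModelMetaclass. A minimal sketch, using a stand-in metaclass so pydantic itself is not required (the class names here are illustrative, not graphrag's):

```python
import json

# Stand-in for Pydantic's ModelMetaclass: passing any class object (rather
# than an instance) to json.dumps triggers the same TypeError.
class ModelMetaclass(type):
    pass

class CommunityReportResponse(metaclass=ModelMetaclass):
    pass

details = {"expected_model": CommunityReportResponse}  # a class, not an instance

try:
    json.dumps(details)
    msg = None
except TypeError as e:
    msg = str(e)

print(msg)  # Object of type ModelMetaclass is not JSON serializable

# One possible mitigation for the *logging* crash: stringify unknown objects
# so the original LLM error gets reported instead of being masked.
print(json.dumps(details, default=str))
```

Note that this only explains the secondary crash in error reporting; the underlying failure is the malformed LLM response shown in the logs below.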


Steps to reproduce

pip install poetry
pip install graphrag
poetry install
graphrag init --root .
poetry run poe index --root .

GraphRAG Config Used

### This config file contains required core defaults that must be set, along with a handful of common optional settings.
### For a full list of available settings, see https://microsoft.github.io/graphrag/config/yaml/

### LLM settings ###
## There are a number of settings to tune the threading and token limits for LLM calls - check the docs.

models:
  default_chat_model:
    type: openai_chat  
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    api_version: ""  
    auth_type: api_key
    api_key: ${ALIYUN_API_KEY}  
    model: qwen-max-latest
    encoding_model: cl100k_base  
    model_supports_json: true
    concurrent_requests: 25
    async_mode: threaded
    retry_strategy: native
    max_retries: -1
    tokens_per_minute: 0
    requests_per_minute: 0

  default_embedding_model:
    type: openai_embedding  
    api_base: https://dashscope.aliyuncs.com/compatible-mode/v1
    auth_type: api_key
    api_key: ${ALIYUN_API_KEY}
    model: text-embedding-v3
    encoding_model: cl100k_base  
    model_supports_json: true
    concurrent_requests: 25
    async_mode: threaded
    retry_strategy: native
    max_retries: -1
    tokens_per_minute: 0
    requests_per_minute: 0

vector_store:
  default_vector_store:
    type: lancedb
    db_uri: output\lancedb
    container_name: default
    overwrite: True

embed_text:
  model_id: default_embedding_model
  vector_store_id: default_vector_store
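For reference, the ${ALIYUN_API_KEY} placeholders above are environment-variable substitutions; the value is assumed to be defined either in the shell environment or in the .env file in the project root, which graphrag loads. A sketch of the corresponding entry (the value shown is a placeholder, not a real credential):

```shell
# .env in the project root (placeholder value)
ALIYUN_API_KEY=sk-your-key-here
```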

### Input settings ###

input:
  type: file # or blob
  file_type: text # [csv, text, json]
  base_dir: "input"

chunks:
  size: 1200
  overlap: 100
  group_by_columns: [id]

### Output settings ###
## If blob storage is specified in the following four sections,
## connection_string and container_name must be provided

cache:
  type: file # [file, blob, cosmosdb]
  base_dir: "cache"

reporting:
  type: file # [file, blob, cosmosdb]
  base_dir: "logs"

output:
  type: file # [file, blob, cosmosdb]
  base_dir: "output"

### Workflow settings ###

extract_graph:
  model_id: default_chat_model
  prompt: "prompts/extract_graph.txt"
  entity_types: [organization,person,geo,event]
  max_gleanings: 1

summarize_descriptions:
  model_id: default_chat_model
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

extract_graph_nlp:
  text_analyzer:
    extractor_type: regex_english # [regex_english, syntactic_parser, cfg]

extract_claims:
  enabled: false
  model_id: default_chat_model
  prompt: "prompts/extract_claims.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 1

community_reports:
  model_id: default_chat_model
  graph_prompt: "prompts/community_report_graph.txt"
  text_prompt: "prompts/community_report_text.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false # if true, will generate node2vec embeddings for nodes

umap:
  enabled: false # if true, will generate UMAP embeddings for nodes (embed_graph must also be enabled)

snapshots:
  graphml: false
  embeddings: false

### Query settings ###
## The prompt locations are required here, but each search method has a number of optional knobs that can be tuned.
## See the config docs: https://microsoft.github.io/graphrag/config/yaml/#query

local_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/local_search_system_prompt.txt"

global_search:
  chat_model_id: default_chat_model
  map_prompt: "prompts/global_search_map_system_prompt.txt"
  reduce_prompt: "prompts/global_search_reduce_system_prompt.txt"
  knowledge_prompt: "prompts/global_search_knowledge_system_prompt.txt"

drift_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/drift_search_system_prompt.txt"
  reduce_prompt: "prompts/drift_search_reduce_prompt.txt"

basic_search:
  chat_model_id: default_chat_model
  embedding_model_id: default_embedding_model
  prompt: "prompts/basic_search_system_prompt.txt"

Logs and screenshots

Error record from logs.json ("Community Report Extraction Error"):

Traceback (most recent call last):
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 126, in _parse_json_string
    return json.loads(value) if value else None
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 478 column 14 (char 9368)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 96, in invoke_json
    return await self.try_receive_json(delegate, prompt, kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 114, in try_receive_json
    raw_json = self._parse_json_string(json_string)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 129, in _parse_json_string
    raise FailedToGenerateValidJsonError(msg) from err
fnllm.base.services.errors.FailedToGenerateValidJsonError: JSON response is not a valid JSON, response=
{
  "title": "中国五人制足球联赛分析:上海队、青岛队与内蒙古队",
  "summary": "该社区围绕中国五人制足球联赛展开,聚焦于上海队、青岛队和内蒙古队的表现及其技术特点。这些球队在2017-2018赛季中展现了不同的进攻组织能力和战术风格,尤其是在最后一传和射门技术上的表现。五人制足球比赛以其快速的攻守转换和多样化的射门技术为特点,与传统十一人制足球比赛形成鲜明对比。社区内还涉及射门技术分布的详细分析,如脚内侧射门、脚背正面射门等。整体来看,这些球队和比赛形式共同构成了一个技术驱动且战术多样的足球生态系统。",
  "rating": 6.5,
  "rating_explanation": "影响严重性评分为中等偏高,主要因为这些球队在五人制足球联赛中的技术特点和战术执行能力可能对比赛结果和联赛发展产生显著影响。",
  "findings": [
    {
    },
    "struct"
  ],"findings": [ { }, "struct" ]
  (the empty "findings"/"struct" fragment repeats for the remainder of the response, which is cut off mid-word)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\base_llm.py", line 144, in __call__
    return await self._decorated_target(prompt, **kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 77, in invoke
    return await this.invoke_json(delegate, prompt, kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 100, in invoke_json
    raise FailedToGenerateValidJsonError from error
fnllm.base.services.errors.FailedToGenerateValidJsonError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\Code\graphrag\graphrag\index\operations\summarize_communities\community_reports_extractor.py", line 82, in __call__
    response = await self._model.achat(
  File "D:\Code\graphrag\graphrag\language_model\providers\fnllm\models.py", line 81, in achat
    response = await self.model(prompt, **kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\openai\llm\openai_chat_llm.py", line 94, in __call__
    return await self._text_chat_llm(prompt, **kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\openai\services\openai_tools_parsing.py", line 130, in __call__
    return await self._delegate(prompt, **kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\base_llm.py", line 148, in __call__
    await self._events.on_error(
  File "D:\Code\graphrag\graphrag\language_model\providers\fnllm\events.py", line 26, in on_error
    self._on_error(error, traceback, arguments)
  File "D:\Code\graphrag\graphrag\language_model\providers\fnllm\utils.py", line 45, in on_error
    callbacks.error("Error Invoking LLM", error, stack, details)
  File "D:\Code\graphrag\graphrag\callbacks\workflow_callbacks_manager.py", line 64, in error
    callback.error(message, cause, stack, details)
  File "D:\Code\graphrag\graphrag\callbacks\file_workflow_callbacks.py", line 47, in error
    json.dumps(
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 201, in encode
    chunks = list(chunks)
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 431, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 438, in _iterencode
    o = _default(o)
  File "D:\Tool\Anaconda\envs\graphrag\lib\json\encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type ModelMetaclass is not JSON serializable

Attachments: indexing-engine.log, logs.json
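The root failure in the log above happens before any schema validation: the model's response text contains an unterminated string, so json.loads raises JSONDecodeError. A minimal illustration (the sample string is made up):

```python
import json

# A response cut off mid-string fails to parse, just like the truncated
# community report in the log above.
bad_response = '{"title": "unterminated'

try:
    json.loads(bad_response)
    err = None
except json.JSONDecodeError as e:
    err = str(e)

print(err)  # Unterminated string starting at: line 1 column 11 (char 10)
```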

15:18:20,985 graphrag.index.operations.summarize_communities.community_reports_extractor ERROR error generating community report
Traceback (most recent call last):
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 139, in _read_model_from_json
    return json_model.model_validate(value)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\pydantic\main.py", line 627, in model_validate
    return cls.__pydantic_validator__.validate_python(
pydantic_core._pydantic_core.ValidationError: 4 validation errors for CommunityReportResponse
findings.0.summary
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
findings.0.explanation
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
findings.1.summary
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
findings.1.explanation
  Field required [type=missing, input_value={}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 96, in invoke_json
    return await self.try_receive_json(delegate, prompt, kwargs)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 115, in try_receive_json
    model = self._read_model_from_json(raw_json, json_model)
  File "D:\Tool\Anaconda\envs\graphrag\lib\site-packages\fnllm\base\services\json.py", line 142, in _read_model_from_json
    raise FailedToGenerateValidJsonError(msg) from err
fnllm.base.services.errors.FailedToGenerateValidJsonError: JSON response does not match the expected model, response={'title': ' ż ', 'summary': ' Χ е Ҫ ż ڲ š ű źͽű ڲ չ Щ ֱ ռ 46% 26% 16% ֳ ļ ʹ ÷ֲ ڲ  õļ ű źͽű ڲ ֱ ڶ ͵ ', 'rating': 3.0, 'rating_explanation': ' ż ڱ Ҫ Խϸߣ Ӱ Ϊ ԣ Ҫ ڼ ֵ ͳ ϡ ', 'findings': [{}, {}]}.
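This second traceback shows a different failure mode: the response parses as JSON, but both findings entries are empty objects, so the CommunityReportResponse model rejects them. A plain-Python sketch of the same check (assuming, per the validation errors above, that summary and explanation are required on each finding):

```python
# Mimic the failing response from the log: two empty findings objects.
response = {
    "title": "t",
    "summary": "s",
    "rating": 3.0,
    "rating_explanation": "r",
    "findings": [{}, {}],
}

# Collect missing required fields the same way Pydantic reports them.
errors = []
for i, finding in enumerate(response["findings"]):
    for field in ("summary", "explanation"):
        if field not in finding:
            errors.append(f"findings.{i}.{field}: Field required")

print(len(errors))  # 4, matching the four validation errors in the traceback
```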

Additional Information

  • GraphRAG Version:
  • Operating System: Windows 10
  • Python Version: 3.10
  • Related Issues:
bodhiiiii added the triage label on Apr 16, 2025
via007 commented Apr 18, 2025

Prompts produced by prompt tuning are problematic; prompts generated with a reasoning model work better. Alibaba's models aren't up to it.
