
Then, I annotated the dataset as follows:

    from deepdoctection.datapoint.view import IMAGE_ANNOTATION_TO_LAYOUTS, Layout

    IMAGE_ANNOTATION_TO_LAYOUTS.update({i: Layout for i in HJExtension})
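
To make the registration step concrete, here is a minimal sketch of what the `update` call does, using stand-ins rather than the real deepdoctection objects. `HJExtensionSketch`, its member names, `LayoutSketch`, and `annotation_to_layouts` are all hypothetical; only the dictionary-comprehension pattern is taken from the snippet above.

```python
from enum import Enum


class HJExtensionSketch(Enum):
    """Hypothetical stand-in for the HJExtension enum; member names are made up."""

    CATEGORY_A = "category_a"
    CATEGORY_B = "category_b"


class LayoutSketch:
    """Hypothetical stand-in for the Layout view class."""


# Stand-in for IMAGE_ANNOTATION_TO_LAYOUTS: maps a category to its view class.
annotation_to_layouts = {}

# Iterating over an Enum class yields its members, so each member becomes a key.
annotation_to_layouts.update({member: LayoutSketch for member in HJExtensionSketch})

print(len(annotation_to_layouts))  # 2
```

Every enum member ends up mapped to the same view class, so any category left out of the enum is also left out of the mapping.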

After this, I added the model to the layout service and ran layout detection:

    import deepdoctection as dd
    from matplotlib import pyplot as plt

    # Resolve weights, config and categories from the registered model profile
    path_weights = dd.ModelCatalog.get_full_path_weights("/kaggle/temp/model_final.pth")
    path_config = dd.ModelCatalog.get_full_path_configs("/kaggle/temp/model_final.pth")
    categories = dd.ModelCatalog.get_profile("/kaggle/temp/model_final.pth").categories

    d2_detector = dd.D2FrcnnDetector(
        path_config,
        path_weights,
        categories,
        config_overwrite=["NMS_THRESH_CLASS_AGNOSTIC=0.8", "MODEL.ROI_HEADS.SCORE_THRESH_TEST=0.1"],
    )
    image_layout = dd.ImageLayoutService(d2_detector)

    # Treat every HJExtension category as a floating text block
    page_parser = dd.PageParsingService(
        text_container=dd.LayoutType.word,
        floating_text_block_categories=[layout_item for layout_item in HJExtension],
    )
    pipe = dd.DoctectionPipe([image_layout], page_parsing_service=page_parser)

    df = pipe.analyze(path="/kaggle/input/japanfileimage/file1")
    df.reset_state()

    df_iter = iter(df)
    dp = next(df_iter)

    image = dp.viz()

    plt.figure(figsize=(25, 17))
    plt.axis("off")
    plt.imshow(image)

However, there appears to be an issue with the category naming: fetching the first datapoint from the pipeline raises a KeyError for '7'.

    ---------------------------------------------------------------------------
    KeyError                                  Traceback (most recent call last)
    Cell In[8], line 6
          3 df.reset_state()
          5 df_iter = iter(df)
    ----> 6 dp = next(df_iter)
          8 image = dp.viz()
         10 plt.figure(figsize = (25,17))

    File /opt/conda/lib/python3.10/site-packages/deepdoctection/dataflow/common.py:109, in MapData.__iter__(self)
        108 def __iter__(self) -> Iterator[Any]:
    --> 109     for dp in self.df:
        110         ret = self.func(copy(dp))  # shallow copy the list
        111         if ret is not None:

    File /opt/conda/lib/python3.10/site-packages/deepdoctection/dataflow/common.py:110, in MapData.__iter__(self)
        108 def __iter__(self) -> Iterator[Any]:
        109     for dp in self.df:
    --> 110         ret = self.func(copy(dp))  # shallow copy the list
        111         if ret is not None:
        112             yield ret

    File /opt/conda/lib/python3.10/site-packages/deepdoctection/pipe/base.py:93, in PipelineComponent.pass_datapoint(self, dp)
         91     with timed_operation(self.__class__.__name__):
         92         self.dp_manager.datapoint = dp
    ---> 93         self.serve(dp)
         94 else:
         95     self.dp_manager.datapoint = dp

    File /opt/conda/lib/python3.10/site-packages/deepdoctection/pipe/layout.py:86, in ImageLayoutService.serve(self, dp)
         84 if self.padder:
         85     np_image = self.padder.apply_image(np_image)
    ---> 86 detect_result_list = self.predictor.predict(np_image)  # type: ignore
         87 if self.padder and detect_result_list:
         88     boxes = np.array([detect_result.box for detect_result in detect_result_list])

    File /opt/conda/lib/python3.10/site-packages/deepdoctection/extern/d2detect.py:260, in D2FrcnnDetector.predict(self, np_img)
        248 """
        249 Prediction per image.
        250 
        251 :param np_img: image as numpy array
        252 :return: A list of DetectionResult
        253 """
        254 detection_results = d2_predict_image(
        255     np_img,
        256     self.d2_predictor,
        257     self.resizer,
        258     self.cfg.NMS_THRESH_CLASS_AGNOSTIC,
        259 )
    --> 260 return self._map_category_names(detection_results)

    File /opt/conda/lib/python3.10/site-packages/deepdoctection/extern/d2detect.py:271, in D2FrcnnDetector._map_category_names(self, detection_results)
        269 filtered_detection_result: List[DetectionResult] = []
        270 for result in detection_results:
    --> 271     result.class_name = self._categories_d2[str(result.class_id)]
        272     if isinstance(result.class_id, int):
        273         result.class_id += 1

    KeyError: '7'

You can check here for more info: Kaggle deepdoctection

The LayoutParser model catalog: layoutparser catalog

Tutorial: Tutorial link

Accepted answer:

Yes, this is the stdout I was referring to. The model has a dense layer with 9 classes (8 + 1 background that will always be added).

So, it looks like one category is missing.

As there is only this warning, I doubt that there is an issue with the model architecture itself. There only seems to be one category missing.
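
One way to see why a single missing category would produce `KeyError: '7'` is the sketch below. It assumes, based on the traceback, that category names are looked up by the string form of a 0-based class id, and that the categories handed to the detector are keyed starting from 1. The function `find_unmapped_class_ids` and the placeholder category names are hypothetical and not part of deepdoctection.

```python
def find_unmapped_class_ids(categories, num_model_classes):
    """Return the 0-based class ids the model can emit that have no category name.

    categories: dict with 1-based string keys, e.g. {"1": "name", ...} (assumed layout).
    num_model_classes: number of foreground classes in the model head.
    """
    # Shift the assumed 1-based keys to the 0-based ids used at lookup time.
    zero_based = {str(int(key) - 1): name for key, name in categories.items()}
    return [str(i) for i in range(num_model_classes) if str(i) not in zero_based]


# Seven placeholder categories, while the model head has 8 foreground classes.
categories = {str(i): f"category_{i}" for i in range(1, 8)}
missing = find_unmapped_class_ids(categories, num_model_classes=8)
print(missing)  # ['7']
```

Under these assumptions, the fix is to make the category list in the model profile match the number of classes the model was trained with, so that the check returns an empty list.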

Answer permalink: https://github.com/deepdoctection/deepdoctection/discussions/234#discussioncomment-7237077

KeyError Issue with HJDataset from LayoutParser #234

Answered by JaMe76
khanhthanhh9 asked this question in Q&A


Answer selected by khanhthanhh9