My research deals with unconstrained handwriting recognition and with document analysis
and indexing.
-
In handwriting recognition, two types of models were studied: generative and discriminative.
For generative models, we tried to unify the recognition mechanisms used in probabilistic Markov models. To this end, the terms
appearing in the Bayes formula are decomposed in different ways according to the dimensionality of the pattern and the dependence
assumptions made on the pattern and between sub-patterns and labels. We distinguished two major cases of decomposition: the shape
relative to the label and the shape relative to the model. The former is more specific to lexical recognition with
1D HMMs (J. Anigbogu PhD). The latter has led to several developments, depending on the dimensionality of the model and the interpretation
of the likelihood of the 2D shape. Experimental results along this line were obtained with HMMs correlated by another HMM (planar HMM,
N. Ben Amara PhD) and with causal random fields. We developed a 2D system suited to handwriting recognition that couples a local view based on
a Markov random field (MRF) with a global view provided by a hidden Markov model (NSHP-HMM, G. Saon PhD). The advantage of this model lies in
applying the MRF directly at the pixel level, without complex primitive-extraction procedures. We then proposed an extension
of this model to analytic recognition without segmentation (S. Vajda PhD) and to a letter-based approach (Ch. Choisy PhD).
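As a reminder of the starting point, the Bayes formula referred to above can be written as follows. The symbols $X$ (observed shape), $\ell$ (label), $\lambda_\ell$ (model associated with a label) and $Q$ (hidden state sequence) are notation assumed here for illustration, not taken from the cited theses.

```latex
\hat{\ell}
  = \arg\max_{\ell} P(\ell \mid X)
  = \arg\max_{\ell} \frac{P(X \mid \ell)\, P(\ell)}{P(X)}
  = \arg\max_{\ell} P(X \mid \ell)\, P(\ell)

% Shape relative to the model: for an HMM, the likelihood is
% obtained by summing over the hidden state sequences Q.
P(X \mid \lambda_\ell) = \sum_{Q} P(X, Q \mid \lambda_\ell)
```

Under this reading, the two cases above differ in whether the likelihood term is expressed directly with respect to the label, $P(X \mid \ell)$, or with respect to a model, $P(X \mid \lambda_\ell)$.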
For discriminative models, we studied neural models with adaptive convolution-layer topologies that handle deformations
inside the NN layers (H. Cecotti PhD). We also studied cognitive models for word reading. We developed a transparent NN
without training, operating through excitation and inhibition (S. Maddouri PhD). This model was used for large-vocabulary Arabic
recognition (Y. Ben Cheikh PhD). Improvements were made by reintroducing learning, reducing the data space and integrating
a time component for correction during training (Y. Rangoni PhD).
-
In document analysis, the methods mainly concern classification and structure extraction.
For classification, we developed incremental clustering algorithms. To analyze the behavior of the various classifiers defined,
we set up a "results observatory" that includes tools for measuring classifier quality and for correction by case-based reasoning
(H. Hamza PhD). In active learning, we proposed an uncertainty measure that characterizes the importance of data
and improves the performance of active learning over existing measures (R. Bouguélia PhD).
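To make the role of an uncertainty measure concrete, here is a minimal sketch of uncertainty-based sample selection. It uses two generic textbook measures (entropy and margin), not the measure proposed in the thesis cited above. The function names and the toy probabilities are hypothetical.

```python
import math

def entropy_uncertainty(probs):
    """Shannon entropy of a class-probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def margin_uncertainty(probs):
    """One minus the gap between the two most probable classes (higher = more uncertain).

    Assumes at least two classes.
    """
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])

def select_queries(pool_probs, budget, measure=entropy_uncertainty):
    """Return the indices of the `budget` most uncertain samples in the pool."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: measure(pool_probs[i]),
        reverse=True,
    )
    return ranked[:budget]

# Toy pool: class probabilities predicted by some classifier (illustrative values).
pool = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.40, 0.35, 0.25],  # ambiguous prediction
    [0.70, 0.20, 0.10],  # moderately confident prediction
]
print(select_queries(pool, budget=2))
```

In an active learning loop, the selected samples would be sent to an annotator and added to the training set before the classifier is retrained.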
For metadata extraction, the systems studied are multi-agent systems with opportunistic reasoning (Y. Chenevoy PhD).
Multiple knowledge sources, weighted by a degree of confidence in the quality of the documents, are organized in an a priori model
and interpreted as segmentation hypotheses (T. Akindele PhD).
For the recognition of fine structures, such as references or tables of contents, we developed a linguistic model
that highlights the information content of text fields. The model assumes that the parts of speech built
around the noun (noun phrases) are those that carry references to objects in the universe of discourse (F. Parmentier PhD).
Furthermore, we addressed named entity recognition in document images, using a predefined database that structures the entities.
The objective is to provide a matching approach between document and database that identifies the entities and locates them
in the document (N. Kooli PhD). Several other works were undertaken on document analysis, such as document stream segmentation
(H. Daher Postdoc), table extraction (Santoch KC Postdoc), segmentation of chemistry
notebooks (N. Ghanmi PhD) and mathematical formula interpretation (A. Kacem PhD).
-
My research group takes part in several European and national projects on document analysis and retrieval.
For more details, see the READ Home Page.