Rezwan Corpus
An Ocean of Hadith in Your Hands
The most comprehensive intelligent database of Shia and Sunni hadith, enriched with the power of artificial intelligence.
Hadith Research
An Old Challenge, A Modern Solution
Rezwan Corpus transforms traditional research barriers into opportunities for deeper knowledge discovery.
Traditional Challenges
- Vast volume and dispersion of sources
- Difficulty in examining chains of transmission (sanad)
- Time-consuming and prone to human error
- Lack of unified and searchable access
The Rezwan Solution
- Fast and unified access to millions of narrations
- Automated analysis and comparison
- Knowledge generation with unprecedented speed and accuracy
- Discovering hidden semantic relationships in texts
Unprecedented Scale
An Ocean of Hadith Data
The power of any analysis depends on the comprehensiveness of its data. Rezwan provides the most complete hadith dataset.
Comprehensive Coverage
Access to hundreds of authentic Shia and Sunni books and sources on a unified platform.
Comparative Research
Enables the study and comparison of narrations between different Islamic schools of thought, which was previously very difficult.
Macro-Level Analysis
Identify recurring patterns, study the evolution of a concept throughout history, and discover intertextual relationships.
Continuous Updates
The corpus is updated every three months with new sources and analyses.
Data Source and Corpus Generation Process
Transparency about data sources and production methods is the foundation of scientific trust.
Primary Data Source
The raw text for this corpus was taken from the Ahl al-Bayt Library software and then processed with a range of intelligent and computational methods.
Because the goal was to build a large hadith corpus from the Ahl al-Bayt Library software, all books in the hadith sources category were selected. Books from all other subject categories were included up to the end of the 5th century AH, except for the lexicography category, where books up to the end of the 3rd century AH were included. From these texts, we attempted to extract hadiths wherever they occur, using various AI and machine learning methods, to create a large collection of hadiths from both schools (Shia and Sunni).
Important Note
Although this process was designed and implemented with the involvement of hadith studies specialists, its outputs have not been fully reviewed or confirmed by experts from seminaries and universities. Some results may therefore contain errors or incorrect content.
Corpus Statistics
The statistics of the hadiths extracted from the sources are as follows:
- Hadith corpus based on Ahl al-Bayt Library: 1,289 books (1,394,080 hadiths)
- Hadith Books: 601 books (984,004 hadiths)
- Historical Books: 264 books (188,836 hadiths)
- Quranic Studies: 94 books (110,752 hadiths)
- Fiqh Books: 61 books (59,204 hadiths)
- Ethical Books: 74 books (47,387 hadiths)
- Theological Books: 88 books (25,292 hadiths)
- Arabic Literature: 67 books (11,870 hadiths)
- Supplications & Visitations: 15 books (5,590 hadiths)
- Principles of Fiqh: 9 books (964 hadiths)
- Logic & Philosophy: 6 books (740 hadiths)
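To compare the categories listed above, the following minimal Python sketch computes each category's share of hadiths and its average number of hadiths per book. The figures are copied from the list; the names `CATEGORIES` and `summarize` are illustrative and not part of any Rezwan tooling. Shares are taken relative to the sum of the listed categories (1,279 books and 1,434,639 hadiths), which differs from the headline total for reasons this page does not state.

```python
# Per-category figures copied from the statistics list: (books, hadiths).
CATEGORIES = {
    "Hadith Books": (601, 984004),
    "Historical Books": (264, 188836),
    "Quranic Studies": (94, 110752),
    "Fiqh Books": (61, 59204),
    "Ethical Books": (74, 47387),
    "Theological Books": (88, 25292),
    "Arabic Literature": (67, 11870),
    "Supplications & Visitations": (15, 5590),
    "Principles of Fiqh": (9, 964),
    "Logic & Philosophy": (6, 740),
}


def summarize(categories):
    """Return (total_books, total_hadiths, rows) for the listed categories."""
    total_books = sum(books for books, _ in categories.values())
    total_hadiths = sum(hadiths for _, hadiths in categories.values())
    rows = []
    for name, (books, hadiths) in categories.items():
        rows.append({
            "category": name,
            "books": books,
            "hadiths": hadiths,
            # Share of the summed category counts, not of the headline total.
            "share": hadiths / total_hadiths,
            "hadiths_per_book": hadiths / books,
        })
    return total_books, total_hadiths, rows


total_books, total_hadiths, rows = summarize(CATEGORIES)
for row in rows:
    print(f"{row['category']:<28} {row['share']:6.1%} "
          f"{row['hadiths_per_book']:8.1f} hadiths/book")
```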
Beyond Words
Discovering Semantic and Thematic Relationships
The unique ability of Rezwan Corpus lies in its deep understanding of hadith content.
Semantic Vectors (Embeddings)
In Rezwan, you can examine the semantic and thematic connections between narrations, even when they are worded differently.
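As a rough illustration of how embedding-based comparison works in general, the sketch below ranks narrations by cosine similarity between their vectors. It is not Rezwan's implementation: the embedding model is not described on this page, and the identifiers `h1` to `h3`, the three-dimensional toy vectors, and the function names are all assumptions made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=2):
    """Return the top_k (id, score) pairs closest to the query narration."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy vectors; real embeddings have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "h1": [0.9, 0.1, 0.0],
    "h2": [0.8, 0.2, 0.1],
    "h3": [0.0, 0.1, 0.9],
}

print(most_similar("h1", TOY_EMBEDDINGS))
```

Here `h2` ranks above `h3` for the query `h1` because its vector points in nearly the same direction, regardless of which words the underlying texts share.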
Summarization and Content Analysis
Rated 9.34 out of 10 for accuracy, our algorithms extract the essence of a long narration and provide a precise summary.
Intelligent Data Enrichment
Each narration in this corpus is enriched with valuable layers of information to transform raw data into usable knowledge.
Separating Sanad and Matn
Intelligent separation of the chain of narrators from the main content of the narration.
Translation of Sanad and Matn
Providing fluent and accurate translations into various languages.
Diacritization
Accurate diacritization of Arabic words for correct pronunciation.
Word-by-Word Translation
Displaying the equivalent meaning of each word for lexical analysis.
Narration Summary
Generating a short and clear summary of the hadith's main message.
Narration Goals
Extracting the main objective that the hadith seeks to explain.
Narration Topics
Assigning key topics for categorization and search.
Commentary and Explanation
Providing detailed explanations to clarify complex concepts.
Quranic References
Identifying and linking related verses from the Holy Quran.
Biblical References
Discovering content connections with Old and New Testament texts.
Lexically Similar Hadiths
Finding narrations that are similar in terms of word structure and phrases.
Semantically Similar Hadiths
Identifying narrations that convey the same message and concept.
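Lexical similarity, in contrast to semantic similarity, compares surface wording. One common way to measure it is Jaccard overlap between sets of word n-grams, sketched below. This is a generic technique chosen for illustration, not a description of the method Rezwan uses; the function names and the placeholder strings are assumptions, and real Arabic text would first need normalization (for example, handling diacritics), which is omitted here.

```python
def word_shingles(text, n=2):
    """Return the set of consecutive n-word tuples in the text."""
    words = text.split()
    if len(words) < n:
        # Too short for a full n-gram: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def lexical_similarity(text_a, text_b, n=2):
    """Jaccard overlap of the word n-gram sets of two texts."""
    return jaccard(word_shingles(text_a, n), word_shingles(text_b, n))


# Placeholder strings standing in for narration texts.
print(lexical_similarity("a b c d", "a b c e"))
```

Two texts that share most of their phrases score close to 1, while texts that express the same idea in different words score low, which is the case the semantic comparison is meant to cover.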
Proven Quality and Practical Applications
Rezwan Corpus has been evaluated by a team of prominent hadith science specialists.
Researchers
Conducting innovative research in a short time
Academics
A powerful tool for teaching and research
Research Centers
Developing large-scale data-driven projects
Seminaries
Transforming traditional research and teaching methods
Easy Access for Everyone
We have designed two main pathways to leverage the power of Rezwan Corpus.
For Researchers and General Users
The Rezwan Corpus is accessible through the Thiqat software, a system offering biographical data, advanced search, various filters, semantic network displays, and more.
Enter Thiqat Software
For Developers and Organizations (API)
Secure and documented APIs for fetching texts, searching, and utilizing other information available in Rezwan for integration into existing systems.
View API Docs (Coming Soon)
License and Data Usage Terms
We believe that knowledge should be freely available while its intellectual rights are preserved.
Creative Commons Attribution–ShareAlike (CC BY-SA 4.0)
Attribution-ShareAlike
This license gives you a lot of freedom, but it has one important condition: if you use data from Rezwan Corpus and create an improved or derivative version, you must also release that new product under the same license (CC BY-SA 4.0).
| Condition | Description |
|---|---|
| Freedom of Use | The data is free to use for any purpose (commercial and non-commercial). |
| Attribution | When using, you must credit "Rezwan Corpus" as the source. |
| Share-Alike | Any new dataset created based on our data must be released under the same license to continue the cycle of free knowledge. |
For scholarly citation of this corpus, you can use the following paper:
Asgari-Bidhendi, M., Ghaseminia, M. A., Shahbazi, A., Hossayni, S. A., Torabian, N., Minaei-Bidgoli, B. (2025). Rezwan: Leveraging Large Language Models for Comprehensive Hadith Text Processing: A 1.2M Corpus Development.
Request Corpus Access
There are two ways to get to know and obtain the corpus: you can first review the data samples, or request the full corpus directly.