Korosuke Part 2: From Concept to Research Paper
Back in October, we introduced Korosuke, our hardware-based prototype designed to securely and efficiently access organizational knowledge without relying on the cloud. That initial post focused on the motivation, technical foundation, and our early-stage implementation.
After months of iteration, research, and development, Korosuke has evolved into a robust, security-aware retrieval framework. We presented this matured version at the 9th Research Day of SRM University-AP and received the Gold Award for Best Research in Computer Science.
From Idea to Innovation
The original goal was to create a secure, AI-powered system capable of retrieving internal data while preserving privacy and reducing cloud dependency. As the system evolved, we realized that effective data access in enterprise environments requires more than just retrieval. It requires context awareness, user role-based filtering, and built-in security enforcement, not just post-retrieval sanitization.
This led to the design of a comprehensive Retrieval-Augmented Generation (RAG) framework that integrates security constraints directly into the retrieval and generation pipeline.
The Core Research Problem
In enterprise environments, information retrieval systems must balance two conflicting needs:
- Deliver relevant results
- Ensure strict access control and compliance
Traditional RAG models prioritize semantic relevance but fail to enforce fine-grained access control. Korosuke addresses this by embedding security attributes directly into the document representation and applying adaptive filtering throughout the pipeline.
The Korosuke Framework
Korosuke combines two key technical innovations to address the problem:
1. Hierarchical Security-Aware Embeddings
A modified embedding function:
E_sec(d, s) = g(f(d), s)
- f(d): Standard semantic embedding of the document d
- s: Security metadata (e.g., classification level)
- g: Transformation that fuses semantic meaning and access control into a single vector
This ensures that documents are semantically indexed and grouped by access level. Unauthorized users naturally see less-relevant (or no) results due to embedding-level separation.
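One way to picture E_sec(d, s) = g(f(d), s) is the minimal sketch below. It is an illustration under stated assumptions, not the actual Korosuke implementation: the post does not say how f or g are built, so here f is a toy hashed bag-of-words embedding and g simply concatenates a scaled one-hot encoding of the classification level. The names DIM, LEVELS, SEC_WEIGHT, and sq_dist are hypothetical.

```python
import hashlib
import math

DIM = 16          # size of the toy semantic embedding (assumed)
LEVELS = 3        # number of classification levels (assumed)
SEC_WEIGHT = 2.0  # scale of the security block (assumed)


def f(text: str) -> list[float]:
    """Toy stand-in for a semantic embedding: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def g(semantic: list[float], level: int) -> list[float]:
    """Fuse semantics and security by appending a scaled one-hot level vector."""
    if not 1 <= level <= LEVELS:
        raise ValueError(f"level must be between 1 and {LEVELS}")
    one_hot = [SEC_WEIGHT if i == level - 1 else 0.0 for i in range(LEVELS)]
    return semantic + one_hot


def e_sec(text: str, level: int) -> list[float]:
    """E_sec(d, s) = g(f(d), s)."""
    return g(f(text), level)


def sq_dist(a: list[float], b: list[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# The same text stored at two different levels ends up far apart.
same_text_gap = sq_dist(e_sec("hostel curfew policy", 1),
                        e_sec("hostel curfew policy", 2))
print(same_text_gap)  # 8.0
```

In this toy construction, two documents at the same level are never more than a squared distance of 2 apart, while any two documents at different levels are at least 8 apart, so nearest-neighbour search clusters by access level first and by meaning second.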
2. Adaptive Re-Ranking Function
To prioritize both relevance and security, Korosuke introduces a utility function:
U(d, q, u) = α · R(d, q) + (1 - α) · S(d, u)
- R(d, q): Relevance score of document d to query q
- S(d, u): Compatibility score between document d and user u, based on access rights
- α: Tunable weight that adjusts the importance of relevance vs. security
This re-ranking ensures that even in ambiguous cases, the system favors secure content delivery tailored to the user’s role.
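The utility function can be sketched in a few lines of Python. This is a hedged illustration rather than the project's code: it assumes both R(d, q) and S(d, u) are already computed and scaled to [0, 1], and the Candidate class, the rerank helper, and the document IDs are hypothetical names introduced for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    doc_id: str
    relevance: float      # R(d, q), assumed to lie in [0, 1]
    compatibility: float  # S(d, u), assumed to lie in [0, 1]


def utility(c: Candidate, alpha: float) -> float:
    """U(d, q, u) = alpha * R(d, q) + (1 - alpha) * S(d, u)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * c.relevance + (1.0 - alpha) * c.compatibility


def rerank(candidates: list[Candidate], alpha: float = 0.5) -> list[Candidate]:
    """Sort candidates by descending utility."""
    return sorted(candidates, key=lambda c: utility(c, alpha), reverse=True)


candidates = [
    Candidate("warden-memo", relevance=0.95, compatibility=0.0),
    Candidate("public-faq", relevance=0.70, compatibility=1.0),
]

# With equal weights, the permitted document outranks the more relevant
# but incompatible one.
print([c.doc_id for c in rerank(candidates, alpha=0.5)])
# ['public-faq', 'warden-memo']
```

Setting alpha to 1.0 reduces the ranking to pure relevance, which puts the incompatible document back on top. A weighted sum on its own only reorders candidates; whether the system also drops documents with zero compatibility is not stated in this post.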
Real-World Deployment: Korosuke Chat
To evaluate the effectiveness of the system, we implemented Korosuke Chat, a conversational AI system deployed on SRM University-AP’s hostel policy documents.
Documents were categorized into three tiers:
- Tier 1: Public (all students)
- Tier 2: Semi-restricted (council members, hostel reps)
- Tier 3: Confidential (administrators and wardens)
Students across these roles interacted with the system and provided feedback through structured and probe-based testing.
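The three tiers map naturally onto an ordered clearance check. The sketch below assumes the tiers are strictly hierarchical (a higher clearance can read everything below it), which the tier descriptions suggest but do not state outright. The role strings, document IDs, and function names are hypothetical.

```python
from enum import IntEnum


class Tier(IntEnum):
    PUBLIC = 1           # all students
    SEMI_RESTRICTED = 2  # council members, hostel reps
    CONFIDENTIAL = 3     # administrators and wardens


# Highest tier each role may read (role names are illustrative).
ROLE_CLEARANCE = {
    "student": Tier.PUBLIC,
    "council_member": Tier.SEMI_RESTRICTED,
    "hostel_rep": Tier.SEMI_RESTRICTED,
    "administrator": Tier.CONFIDENTIAL,
    "warden": Tier.CONFIDENTIAL,
}


def can_access(role: str, doc_tier: Tier) -> bool:
    """Return True if the role's clearance covers the document's tier."""
    clearance = ROLE_CLEARANCE.get(role)
    if clearance is None:
        return False  # unknown roles are denied by default
    return doc_tier <= clearance


def visible_documents(role: str, docs: dict[str, Tier]) -> list[str]:
    """List the IDs of documents the role may see, in sorted order."""
    return sorted(doc_id for doc_id, tier in docs.items()
                  if can_access(role, tier))


docs = {
    "visiting-hours": Tier.PUBLIC,
    "council-minutes": Tier.SEMI_RESTRICTED,
    "disciplinary-log": Tier.CONFIDENTIAL,
}

print(visible_documents("hostel_rep", docs))
# ['council-minutes', 'visiting-hours']
```

A binary check like this could serve as the simplest form of the compatibility score S(d, u), with 1.0 for permitted documents and 0.0 otherwise.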
Evaluation Results
- Response Accuracy: 91.2%
- Security Compliance: 98.4%
- User Satisfaction (Appropriateness): 4.7 / 5
- Information Leakage: 0%
- False Restriction Rate: 4.2%
- Query Understanding: 94.5%
- Average Response Time: 1.8 seconds
Students appreciated the transparency of access restrictions and the tailored explanations based on their roles.
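For readers who want to reproduce metrics of this kind, the sketch below shows one plausible way to compute an information leakage rate and a false restriction rate from a log of probe trials. The definitions are assumptions, since this post does not spell out how the figures above were calculated, and the trial records are toy values, not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    authorized: bool  # was the user entitled to the requested information?
    answered: bool    # did the system disclose it?


def leakage_rate(trials: list[Trial]) -> float:
    """Share of unauthorized requests that were answered anyway."""
    unauthorized = [t for t in trials if not t.authorized]
    if not unauthorized:
        return 0.0
    return sum(t.answered for t in unauthorized) / len(unauthorized)


def false_restriction_rate(trials: list[Trial]) -> float:
    """Share of authorized requests that were wrongly refused."""
    authorized = [t for t in trials if t.authorized]
    if not authorized:
        return 0.0
    return sum(not t.answered for t in authorized) / len(authorized)


# Toy log: four legitimate requests (one refused), two probes (both blocked).
toy_log = [
    Trial(authorized=True, answered=True),
    Trial(authorized=True, answered=True),
    Trial(authorized=True, answered=True),
    Trial(authorized=True, answered=False),
    Trial(authorized=False, answered=False),
    Trial(authorized=False, answered=False),
]

print(leakage_rate(toy_log))            # 0.0
print(false_restriction_rate(toy_log))  # 0.25
```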
Recognition at SRMAP Research Day
Korosuke was presented at the 9th Research Day at SRM University-AP, where projects across engineering, sciences, and humanities were evaluated by a panel of academic and industry experts.
We demonstrated:
- A live working prototype of Korosuke Chat
- Architectural overview and theoretical model
- Security guarantees under formal threat models
- Quantitative benchmarks comparing Korosuke to baseline RAG and rule-based systems
Korosuke received the Gold Award for its combination of technical depth, practical application, and enterprise impact.
Roadmap and Future Work
The next development milestones for Korosuke include:
- Federated Retrieval: Enable secure knowledge sharing across organizations with differing access policies
- Mobile Access Support: Extend Korosuke Chat to authenticated mobile devices
- Formal Verification: Incorporate verifiable security proofs to strengthen compliance guarantees
- Domain Adaptation: Apply the Korosuke framework in regulated industries like healthcare, government, and finance
Team Acknowledgments
Sachin: Magically wrangled RAG and reverse proxy while nobody understood what he was talking about. When asked to explain his work, just mumbled “vectors” and everyone nodded wisely.
Giridhar: Mostly contributed by creating problems, then saying “wait, I think I have an idea” before disappearing.
Prakashita: Team’s official energy supplier. Paper contributions included adding commas.
Sushil: MVP development consisted of crying in various corners of the building. Research “writing” involved staring at a blank document while questioning life choices. Eventually cried so hard on stage that the judges felt bad and gave points for “emotional commitment.”
Jayadhar: The actual hero who did everything while everyone else was busy having existential crises. Created the presentation while simultaneously fixing bugs, writing the paper, and providing therapy to Sushil.
We are thankful to SRM University-AP for providing the platform to carry out this research and for recognizing it with the Gold Award.
Conclusion
Korosuke demonstrates that it is possible to blend contextual intelligence with robust access control in modern AI systems. By embedding security policies at the core of the retrieval architecture, rather than applying them as external filters, we created a framework that is both practical and principled.
The combination of strong empirical results, institutional deployment, and formal modeling shows that Korosuke is a viable foundation for secure enterprise knowledge systems.