
Fraudulent Email Classifier
Technologies Used:
- Backend: PHP / Java / Node.js
- Frontend: HTML, CSS, Bootstrap, JavaScript
- ML Tools: Python, Scikit-learn, Pandas, NLTK / spaCy
- Database: MySQL / MongoDB
- Optional APIs: Gmail API / IMAP (to fetch emails)
Project Objective:
To develop a machine learning-based web application that classifies emails as fraudulent (phishing/spam) or genuine, helping users and organizations prevent cyber-attacks and data breaches.
Key Features:
- Email Content Upload: Users can manually paste email content or, optionally, connect their email account to scan received messages.
- Fraud Detection Engine: An ML classifier identifies fraud indicators in the email’s subject, body, and metadata.
- Highlight Suspicious Elements: The system highlights links, keywords, or patterns typically found in phishing/scam emails.
- Real-time Classification: Each email is labeled instantly as one of:
  - Genuine
  - Suspicious
  - Fraudulent
- Admin Panel: Admins can view flagged emails, model accuracy, and user-reported spam.
- Feedback Loop: Users can flag misclassified emails, and these corrections are used to retrain and improve the model.
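The highlighting and three-level labeling features could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the keyword list, the URL pattern, the function names, and the 0.4 / 0.8 thresholds are all assumptions chosen for the example, and the fraud probability is assumed to come from whatever classifier the project trains.

```python
import re

# Hypothetical keyword list, for illustration only; a real list would be
# derived from the training data or maintained by admins.
SUSPICIOUS_KEYWORDS = ("verify your account", "urgent", "password", "wire transfer")

# Simple URL matcher (an assumption; it does not cover every valid URL form).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def find_suspicious_elements(text):
    """Return the links and keyword hits found in the email text."""
    links = URL_PATTERN.findall(text)
    lowered = text.lower()
    keywords = [kw for kw in SUSPICIOUS_KEYWORDS if kw in lowered]
    return {"links": links, "keywords": keywords}


def label_from_score(fraud_probability, suspicious_at=0.4, fraudulent_at=0.8):
    """Map a classifier's fraud probability to one of the three output labels.

    The thresholds are assumed values and would need tuning on validation data.
    """
    if fraud_probability >= fraudulent_at:
        return "Fraudulent"
    if fraud_probability >= suspicious_at:
        return "Suspicious"
    return "Genuine"
```

For example, `find_suspicious_elements(email_text)` gives the frontend the spans to highlight, while `label_from_score(probability)` turns the model output into the label shown to the user.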
Dataset for Training:
Use publicly available datasets such as:
- SpamAssassin Public Corpus
- Kaggle’s "Spam Email" datasets
Features include:
- Subject line
- Email body text
- Sender domain
- Links and attachments
- Word frequency and pattern usage
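As a rough sketch of how the features listed above might be pulled out of a raw email, the example below uses only Python's standard library. The function name `extract_features`, the returned dictionary keys, and the regular expressions are assumptions made for illustration. In the actual project, tokenization would more likely go through NLTK or spaCy, and the result would be vectorized with Scikit-learn before training.

```python
import re
from collections import Counter
from email import message_from_string
from email.utils import parseaddr

# Illustrative patterns (assumptions): a simple URL matcher and a crude tokenizer.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)
WORD_PATTERN = re.compile(r"[a-z']+")


def extract_features(raw_email):
    """Parse a raw RFC 822 style email string into a feature dictionary."""
    msg = message_from_string(raw_email)

    subject = msg.get("Subject", "")
    _, address = parseaddr(msg.get("From", ""))
    sender_domain = address.rpartition("@")[2].lower() if "@" in address else ""

    body_parts = []
    attachment_count = 0
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        if part.get_filename():
            attachment_count += 1
        elif part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            body_parts.append(payload.decode(charset, errors="replace"))
    body = "\n".join(body_parts)

    return {
        "subject": subject,
        "body": body,
        "sender_domain": sender_domain,
        "links": URL_PATTERN.findall(body),
        "attachment_count": attachment_count,
        "word_counts": Counter(WORD_PATTERN.findall(body.lower())),
    }
```

Keeping extraction separate from the classifier means the same function can serve pasted content, messages fetched over IMAP, and the training corpus.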