Find the PDF - Instructions
This system provides a web interface for contributors to find and upload PDF files for academic papers based on bibliography entries.
How it works
- User Assignment: Each user gets a unique fingerprint based on their browser characteristics
- Random Entry: Users are assigned a random bibliography entry from the BibTeX file
- 20-minute Lock: Once assigned, the entry is locked for 20 minutes to prevent duplicate assignments
- File Upload: Users can drag-and-drop or browse for PDF files
- GitHub Integration: Files are uploaded directly to your GitHub repository using the GitHub API
- Automatic Naming: Uploaded PDFs are renamed to match the bibliography key and committed to the repo
Setup Instructions
For Repository Owner (One-time setup)
- Create a GitHub Personal Access Token:
- Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic)
- Click “Generate new token (classic)”
- Give it a name like “FindPDF Upload”
- Select the “repo” scope (full control of repositories)
- Click “Generate token”
- Important: Copy the token immediately (you won’t see it again)
- Configure the Token:
- Visit your /findthepdf/ page
- Click the “⚙️ Setup Token” button
- Paste your token and click “Save Token”
- The button should turn green with “✅ Token Configured”
For Contributors (Users uploading PDFs)
No setup required! The token is already configured by the repository owner.
Features
- Hidden Page: Accessible only at /findthepdf/ (not linked in navigation)
- Responsive Design: Works on desktop and mobile devices
- File Validation: Only accepts PDF files up to 50MB
- GitHub API Integration: Direct upload to repository (completely free!)
- Progress Tracking: Visual feedback during upload process
- Token Management: Secure GitHub authentication system
- Next Button: Users can get a new assignment after completing one
File Structure
files/findapdf/
├── findapdf_tofind.bib # Source bibliography file
├── PDFs/ # Directory for uploaded PDF files (auto-created)
├── upload.php # Legacy server-side script (not needed with GitHub API)
└── README.md # This file
GitHub API Benefits
✅ Completely Free: Uses GitHub’s free API
✅ No Server Required: Works with static GitHub Pages
✅ Automatic Version Control: All uploads are tracked in git history
✅ Secure: Uses GitHub’s authentication system
✅ Reliable: Built on GitHub’s infrastructure
Access
The page is accessible at: https://kilianrouge.github.io/findthepdf/
Technical Notes
GitHub API Integration
The system now uses GitHub’s API to upload files directly to your repository:
- Authentication: Uses GitHub Personal Access Tokens
- File Storage: PDFs are committed directly to files/findapdf/PDFs/
- Naming: Files are automatically renamed to {bibkey}.pdf
- Version Control: All uploads create git commits with descriptive messages
- Security: Tokens are stored locally in browser localStorage
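The upload step described above can be sketched against GitHub's REST "create or update file contents" endpoint, which takes a commit message and the file as base64. This is a minimal sketch, not the page's actual code: `buildUploadRequest`, `bytesToBase64`, the commit message wording, and the `owner`/`repo` parameters are assumptions made for illustration.

```javascript
// Sketch only: function names and commit message format are hypothetical.
const PDF_DIR = "files/findapdf/PDFs";

// Base64-encode a Uint8Array in chunks so large PDFs do not overflow the call stack.
function bytesToBase64(bytes) {
  let binary = "";
  const chunk = 0x8000;
  for (let i = 0; i < bytes.length; i += chunk) {
    binary += String.fromCharCode.apply(null, bytes.subarray(i, i + chunk));
  }
  // btoa exists in browsers and recent Node; fall back to Buffer otherwise.
  return typeof btoa === "function"
    ? btoa(binary)
    : Buffer.from(binary, "binary").toString("base64");
}

// Build the URL and fetch options for PUT /repos/{owner}/{repo}/contents/{path}.
function buildUploadRequest({ owner, repo, bibkey, bytes, token }) {
  const path = `${PDF_DIR}/${encodeURIComponent(bibkey)}.pdf`;
  return {
    url: `https://api.github.com/repos/${owner}/${repo}/contents/${path}`,
    options: {
      method: "PUT",
      headers: {
        Authorization: `Bearer ${token}`,
        Accept: "application/vnd.github+json",
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        message: `Add PDF for ${bibkey}`,
        content: bytesToBase64(bytes),
      }),
    },
  };
}
```

In a browser, the result would be passed to `fetch(url, options)`, with `bytes` obtained from `new Uint8Array(await file.arrayBuffer())`. Replacing a file that already exists additionally requires the existing file's `sha` in the request body.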
No Server Required!
This solution is completely free and requires no additional hosting:
- ✅ Works with GitHub Pages (static hosting)
- ✅ Uses GitHub’s free API (authenticated requests are rate-limited to 5,000 per hour, which typical usage will not reach)
- ✅ No PHP/server-side code needed
- ✅ No monthly fees or external services
Browser Fingerprinting
The system uses a combination of browser characteristics to create a unique identifier:
- User agent string
- Language settings
- Screen resolution
- Timezone offset
- Canvas fingerprint
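Unlike an IP address, this identifier is available to client-side code and stays the same when the user switches networks. Two users with identical browser configurations may still receive the same fingerprint.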
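One way to combine the characteristics listed above is to join them into a single string and hash it. The sketch below is an assumption about how this could be done, not the page's actual implementation: `fingerprint`, `fnv1a`, the field names of `traits`, and the choice of an FNV-1a-style hash are all illustrative.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units, returned as 8 hex characters.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Combine browser traits into one identifier (hypothetical field names).
// In a browser the traits could come from:
//   userAgent:      navigator.userAgent
//   language:       navigator.language
//   screen:         `${screen.width}x${screen.height}`
//   timezoneOffset: new Date().getTimezoneOffset()
//   canvas:         canvasElement.toDataURL() after drawing some text
function fingerprint(traits) {
  const parts = [
    traits.userAgent,
    traits.language,
    traits.screen,
    traits.timezoneOffset,
    traits.canvas,
  ];
  return fnv1a(parts.join("|"));
}
```

A 32-bit hash keeps the example short; a longer digest would lower the chance of two different browsers colliding.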
Data Persistence
User assignments are stored in localStorage, which means:
- Data persists across browser sessions
- Data is isolated per domain
- Users can clear their assignment by clearing browser data
- No server-side database required
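The 20-minute lock can be kept in localStorage by storing the assignment together with a timestamp and discarding it once it is too old. The following is a minimal sketch under that assumption: `STORAGE_KEY`, `saveAssignment`, `loadAssignment`, and `memoryStorage` are hypothetical names, and the storage object is passed in so the same code runs with `window.localStorage` or with an in-memory stand-in.

```javascript
const LOCK_MS = 20 * 60 * 1000; // 20-minute lock
const STORAGE_KEY = "findpdf_assignment"; // hypothetical key name

// Store the assigned bibliography key with the time it was assigned.
function saveAssignment(storage, bibkey, now = Date.now()) {
  storage.setItem(STORAGE_KEY, JSON.stringify({ bibkey, assignedAt: now }));
}

// Return the current assignment, or null if none exists or the lock expired.
function loadAssignment(storage, now = Date.now()) {
  const raw = storage.getItem(STORAGE_KEY);
  if (raw === null) return null;
  const { bibkey, assignedAt } = JSON.parse(raw);
  if (now - assignedAt >= LOCK_MS) {
    storage.removeItem(STORAGE_KEY);
    return null;
  }
  return bibkey;
}

// In-memory stand-in with the same getItem/setItem/removeItem shape as localStorage.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => { data.set(key, String(value)); },
    removeItem: (key) => { data.delete(key); },
  };
}
```

In the page itself, `saveAssignment(window.localStorage, key)` would be called when an entry is assigned and `loadAssignment(window.localStorage)` on each page load.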
Bibliography Format
The system expects BibTeX entries with these fields:
- title: Paper title
- author: Author(s)
- date: Publication date/year
- doi: Digital Object Identifier (optional)
Example entry:
@article{ExampleKey2024,
title = {Example Paper Title},
author = {Author Name},
date = {2024},
doi = {10.1000/example},
}
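An entry in this shape can be read with a small regular-expression parser. This is a simplified sketch, not the parser the page uses: `parseBibEntry` is a hypothetical helper, and it only handles `field = {value}` pairs without nested braces.

```javascript
// Parse a single BibTeX entry into { type, key, fields }.
// Limitation: values must be brace-delimited and contain no nested braces.
function parseBibEntry(text) {
  const header = /@(\w+)\s*\{\s*([^,\s]+)\s*,/.exec(text);
  if (!header) return null;
  const entry = { type: header[1].toLowerCase(), key: header[2], fields: {} };
  const fieldPattern = /(\w+)\s*=\s*\{([^{}]*)\}/g;
  let match;
  while ((match = fieldPattern.exec(text)) !== null) {
    entry.fields[match[1].toLowerCase()] = match[2].trim();
  }
  return entry;
}

const example = `@article{ExampleKey2024,
title = {Example Paper Title},
author = {Author Name},
date = {2024},
doi = {10.1000/example},
}`;

const parsed = parseBibEntry(example);
console.log(parsed.key, parsed.fields.title);
```

The `key` field is what the uploaded PDF is named after; `doi` is simply absent from `fields` when the entry does not provide it.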
Customization
You can customize the appearance by modifying the CSS in findthepdf.html:
- Colors and styling
- Upload area appearance
- Button styles
- Layout and spacing
Security Considerations
- File type validation (PDF only)
- File size limits (50MB maximum)
- Filename sanitization (bibkey-based naming)
- No executable file uploads
- Client-side validation with server-side verification (when using PHP script)
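The client-side checks listed above could look roughly like the following. This is a sketch under stated assumptions: `validatePdf` and `targetName` are hypothetical names, 50MB is taken as 50 × 1024 × 1024 bytes, and the set of characters kept in the file name is an illustrative choice.

```javascript
const MAX_BYTES = 50 * 1024 * 1024; // assumed meaning of "50MB"

// Return an error message, or null if the file passes.
// `file` needs name, type, and size, as on a browser File object.
function validatePdf(file) {
  const hasPdfName = /\.pdf$/i.test(file.name);
  // Some browsers report an empty MIME type; fall back to the extension then.
  const isPdf =
    file.type === "application/pdf" || (file.type === "" && hasPdfName);
  if (!isPdf) return "Only PDF files are accepted";
  if (file.size > MAX_BYTES) return "File exceeds the 50MB limit";
  return null;
}

// Derive the stored file name from the bibliography key, replacing any
// character outside a conservative allow-list with an underscore.
function targetName(bibkey) {
  return bibkey.replace(/[^A-Za-z0-9_.:-]/g, "_") + ".pdf";
}
```

Because the stored name is derived from the bibliography key rather than from the uploaded file's own name, a misleading original file name never reaches the repository.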
Troubleshooting
Common Issues
- Bibliography not loading: Check that the BibTeX file path is correct
- Same entry appearing: Clear browser localStorage or wait 20 minutes
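- Upload not working: Check that the GitHub token is configured (the button shows “✅ Token Configured”); the legacy upload.php script does not run on GitHub Pages (static site)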
- File not accepted: Ensure file is PDF format and under 50MB
Browser Support
- Modern browsers with JavaScript enabled
- File API support required for drag-and-drop
- localStorage support required for user tracking
- Canvas API for fingerprinting
Future Enhancements
Possible improvements:
- Admin interface to view upload statistics
- Progress tracking across all entries
- Email notifications for new uploads
- Integration with reference management systems
- Automatic DOI validation
- Duplicate detection
