Shih Peng Wen

I am a Python Developer with expertise in Data Science, Natural Language Processing, and Testing.
Currently, I work as a Data Analysis Engineer at GoMore.
Personal Projects
- NDLTD TW Papers Graph
- Uses Vue.js for the frontend and FastAPI for the backend, with OpenSearch as the vector database.
- Uses GitHub Actions to automate the CI/CD process and is hosted on AWS.
- Uses AWS Lambda as a web scraper to gather data, Sentence-Transformers to encode the meaning of the text as vectors, and KNN to find articles that are similar to each other.
- Inspired by: Keyword Analysis (GroundAI)
- https://ndltd-tw-papers-graph.wspooong.com
- LY Transcription
- LY Transcription helps you download Legislative Gazette records from the Legislative Yuan (立法院) and convert them into JSON files.
- The source files are in DOC format, but Python tooling currently supports Word files only in DOCX format, so each file must be converted from DOC to DOCX before it can be transformed into JSON.
- Note: The DOC-to-DOCX conversion tool is provided by Microsoft Office, and I have tested it only on Windows. This project is therefore currently compatible only with Windows systems.
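
The similarity search described for NDLTD TW Papers Graph can be illustrated with a minimal sketch. It assumes each article has already been turned into an embedding vector; the tiny hand-written vectors below stand in for real Sentence-Transformers output, and the brute-force search stands in for the KNN query that a vector database would run. The names `cosine_similarity`, `nearest_neighbors`, and `toy_embeddings` are hypothetical and are not taken from the project.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(
    query_id: str,
    embeddings: Dict[str, List[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k articles most similar to query_id, best match first."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id  # an article is not its own neighbor
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Assumption: toy 3-dimensional vectors; real sentence embeddings
# have hundreds of dimensions.
toy_embeddings = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.8, 0.2, 0.1],
    "paper_c": [0.0, 0.1, 0.9],
    "paper_d": [0.1, 0.9, 0.2],
}

neighbors = nearest_neighbors("paper_a", toy_embeddings, k=2)
for paper_id, score in neighbors:
    print(f"{paper_id}: {score:.3f}")
```

A brute-force scan like this compares the query against every vector, which is fine for a handful of articles; a vector database exists to make the same kind of lookup practical at a much larger scale.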
Skill-Based Project
- Facebook Group Scraper
- Written in Python 3.7. Uses the requests library to extract group posts, comments, and replies, with pagination support and configurable wait times between requests.
- Heavily inspired by kevinzg/facebook-scraper.
- GitHub: wspooong/facebook-group-scraper
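
Pagination with a configurable wait between requests, as mentioned for the Facebook Group Scraper, might look roughly like the sketch below. This is not the scraper's actual code: `iter_pages`, `fetch_page`, `next_cursor`, and the in-memory `FAKE_PAGES` are hypothetical names chosen for illustration. In a real scraper, `fetch_page` would presumably wrap an HTTP call made with the requests library; here it reads from a dictionary so the example runs without network access.

```python
import time
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def iter_pages(
    fetch_page: Callable[[Optional[str]], Page],
    wait_seconds: float = 1.0,
    max_pages: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Any]:
    """Yield items page by page, pausing between consecutive fetches."""
    cursor: Optional[str] = None  # None means "start from the first page"
    pages_fetched = 0
    while True:
        page = fetch_page(cursor)
        pages_fetched += 1
        yield from page["items"]
        cursor = page.get("next_cursor")
        reached_limit = max_pages is not None and pages_fetched >= max_pages
        if cursor is None or reached_limit:
            break
        sleep(wait_seconds)  # configurable delay before the next request


# Assumption: a fake data source standing in for real HTTP responses.
FAKE_PAGES: Dict[Optional[str], Page] = {
    None: {"items": ["post 1", "post 2"], "next_cursor": "p2"},
    "p2": {"items": ["post 3"], "next_cursor": "p3"},
    "p3": {"items": ["post 4"], "next_cursor": None},
}


def fake_fetch(cursor: Optional[str]) -> Page:
    return FAKE_PAGES[cursor]


waits = []  # records each requested delay instead of actually sleeping
posts = list(iter_pages(fake_fetch, wait_seconds=0.5, sleep=waits.append))
print(posts)
print(waits)
```

Passing `sleep` as a parameter keeps the delay configurable and lets the loop be exercised without real waiting.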