Topic Selection
The Reasoning Behind Selecting Social Media's Role in Language Evolution as Our Project Topic
As a group, we recognized that social media was an interest we all shared. We had watched it evolve rapidly over the years and had kept up with those changes since before college. It became evident that social media has consistently played a significant role in our lives and has influenced us in many ways. One of the most notable effects was on our language use, particularly slang. We all noticed that slang had become a prevalent part of how we communicate, which prompted us to explore the broader effects of social media on language evolution.
Explanation of Resources & Data Decisions
For our TikTok dataset, we scraped the data with ParseHub, a web-scraping tool provided as a Google Chrome extension. We chose this tool because TikTok is accessible through a web browser and every post follows a similar format, so the scraper can gather data consistently while we navigate from post to post. After collecting the data, we cleaned it. We kept four columns: the web link, the username, the user's comment, and the posting date. Together, these columns give us an accurate record of what was said, who said it, and when it was said. Keeping the date is especially important because it lets us create a visual timeline of when the word “situationship” was most relevant. We also removed unnecessary vocabulary and emojis, which were not useful for our analysis. For example, common words like “a,” “and,” and “https” appeared in nearly every comment but were unrelated to “situationship.” Keeping these words would distract from the analysis and lead us away from an accurate interpretation. We did all of this data cleaning in Python, using Visual Studio Code. We chose Python over other programming languages because of its wide range of tools and libraries for text analysis. Python's widespread use in machine learning, large language models, and natural language processing also gave us more resources to draw on. We stored the dataset in CSV (comma-separated values) format to make the data easier to analyze. This organization allowed us to use a sentiment analysis tool to analyze the frequency of each vocabulary word and its relative frequency.
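The cleaning and counting steps described above could look roughly like the following Python sketch. It is only an illustration, not our exact script: the stop-word list, the function names `clean_comment` and `word_frequencies`, and the sample comments are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; the real list was longer.
STOP_WORDS = {"a", "and", "the", "is", "to", "https"}

# Matches web links so they can be removed before tokenizing.
URL_PATTERN = re.compile(r"https?://\S+")
# Keeps only lowercase alphabetic words (optionally with one apostrophe),
# which also drops emojis, digits, and punctuation.
WORD_PATTERN = re.compile(r"[a-z]+(?:'[a-z]+)?")


def clean_comment(text):
    """Lowercase a comment, strip links and emojis, and remove stop words."""
    text = URL_PATTERN.sub(" ", text.lower())
    tokens = WORD_PATTERN.findall(text)
    return [token for token in tokens if token not in STOP_WORDS]


def word_frequencies(comments):
    """Return raw counts and relative frequencies across all comments."""
    counts = Counter()
    for comment in comments:
        counts.update(clean_comment(comment))
    total = sum(counts.values())
    relative = {word: n / total for word, n in counts.items()} if total else {}
    return counts, relative


# Made-up sample comments, not taken from our dataset.
sample_comments = [
    "My situationship is a mess 😭 https://tiktok.com/x and more",
    "a situationship",
]
counts, relative = word_frequencies(sample_comments)
```

Matching only alphabetic tokens is a simple way to drop emojis without a separate emoji library, at the cost of also discarding numbers and hashtags.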
For our Reddit dataset, we followed a very similar process, using BrowserFlow to scrape the data and converting it into CSV format. This time, however, we chose to store it in Google Sheets, which has an easy-to-use filtering process and a simple formula converter. We stored the data in columns for the web link, subreddit name, year posted, age of the poster, gender of the poster, and post title. We obtained the age and gender data using the Google Sheets formula converter, and we included these columns to show the different demographics of users who use the word “situationship.” Some entries in the age and gender columns were blank or not applicable, so we filtered them out to keep our representation of the age and gender demographics accurate. We chose Google Sheets because its user-friendly tools suited our dataset's characteristics and our analysis.
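We did this filtering in Google Sheets, but the same step can be sketched in Python for readers who prefer code. The column names, the set of "missing" markers, the function name `drop_incomplete`, and the sample rows below are all assumptions invented for this illustration; they are not taken from our dataset.

```python
import csv
import io
from collections import Counter

# Values treated as missing (an assumption; adjust to the real data).
MISSING = {"", "n/a", "na", "not applicable"}


def drop_incomplete(rows, required=("age", "gender")):
    """Keep only rows where every required column has a usable value."""
    return [
        row
        for row in rows
        if all((row.get(col) or "").strip().lower() not in MISSING for col in required)
    ]


# Made-up sample data with hypothetical column names.
SAMPLE_CSV = """subreddit,year,age,gender,title
r/example,2022,24,F,Example post one
r/example,2023,,M,Example post two
r/example,2023,31,N/A,Example post three
r/example,2021,27,M,Example post four
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept = drop_incomplete(rows)

# Count the remaining posts by gender for a simple demographic summary.
gender_counts = Counter(row["gender"] for row in kept)
```

Dropping incomplete rows keeps the demographic counts honest, though it also shrinks the sample, so the number of rows removed is worth noting alongside any chart.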
Team Biographies
Acknowledgements
We would like to acknowledge and express our gratitude to the following people and organizations for their support and assistance throughout this project:
- Dr. Scott Caddy, for introducing most of us to the wonderful world of digital humanities.
- Rosa M. Norton, for providing constructive feedback and helping us refine our project.
- The Reddit and TikTok communities, especially the users of relevant subreddits and TikTok accounts related to the term “situationship,” whose content was central to our research.
- The creators of ParseHub and BrowserFlow, for developing the tools we used to scrape our data.
- The contributors of Vecteezy Stock Images and Wix Stock Images, for providing the multimodal elements that help illuminate our overall narrative and make our website a more enjoyable experience.