About

Get to Know Us

Our About page explains why we chose social media as our project topic, introduces our team members and their roles, and acknowledges the support we've received. It also describes the resources and methods we used for data collection and analysis.

Topic Selection

The Reasoning Behind Selecting Social Media's Role in Language Evolution as Our Project Topic

As a group, we recognized that social media was a common interest among us. We observed how rapidly social media has evolved over the years and how we have kept up with these changes since before college. It became evident that social media has consistently played a significant role in our lives, influencing us in various ways. One of the most notable impacts was on our language use, particularly slang. We all noticed that slang had become a prevalent part of our communication, which prompted us to explore the broader effects of social media on language evolution.

Explanation of Resources & Data Decisions

For our TikTok dataset, we scraped our data with ParseHub, a web-scraping tool provided as a Google Chrome extension. We chose this tool because TikTok is accessible through a web browser and every post follows a similar format, so the scraper can gather data reliably as we navigate from post to post.

After collecting the data, we cleaned it. We kept the columns for web link, username, user comment, and posting date. Together, these columns record what was said, who said it, and when. Keeping the date is especially important because it lets us build a visual timeline of when the word “situationship” was most relevant. We also removed emojis and unnecessary vocabulary that did not serve our analysis. For example, common words like “a,” “and,” and “https” appeared in nearly every comment but were unrelated to “situationship.” Keeping them would have distracted from the analysis and led us away from an accurate interpretation.

We did all of this data cleaning in Python, using Visual Studio Code. We chose Python over other programming languages because of its wide range of tools and libraries for text analysis, and its prominence in machine learning, large language models, and natural language processing gave us more resources to draw on. We stored the cleaned dataset in CSV (comma-separated values) format to make it easier to analyze. This structure allowed us to use a sentiment analysis tool to measure how often each word appeared, both as raw counts and as relative frequencies.
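A minimal sketch of the cleaning and counting steps described above, using only the Python standard library. The stop-word list, the function names, and the example comments are hypothetical and are not taken from the project's actual code; the real cleaning script may have used different libraries and rules.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the project's actual list is not given in the text.
STOP_WORDS = {"a", "and", "the", "https"}


def clean_comment(text):
    """Return the meaningful words of one comment as a list of lowercase strings."""
    text = text.lower()
    # Drop full links before tokenizing so URL fragments do not become words.
    text = re.sub(r"https?://\S+", " ", text)
    # Keep only runs of ASCII letters and apostrophes; emojis, digits,
    # and punctuation are discarded by this pattern.
    words = [w.strip("'") for w in re.findall(r"[a-z']+", text)]
    return [w for w in words if w and w not in STOP_WORDS]


def word_frequencies(comments):
    """Count each word across all comments and compute its share of all words."""
    counts = Counter()
    for comment in comments:
        counts.update(clean_comment(comment))
    total = sum(counts.values())
    relative = {word: n / total for word, n in counts.items()} if total else {}
    return counts, relative


# Made-up comments, for illustration only.
example_comments = [
    "A situationship and a mess 😭",
    "situationship https://example.com/post",
]
counts, relative = word_frequencies(example_comments)
print(counts.most_common(5))
```

Removing links first and then filtering against a stop-word list mirrors the decision to drop words like “a,” “and,” and “https” that appear in nearly every comment.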

For our Reddit dataset, we followed a very similar process, using BrowserFlow to scrape our data and converting it into CSV format. This time, however, we stored it in Google Sheets, which offers easy-to-use filtering and a simple formula converter. We stored the data in columns for web link, subreddit name, year posted, age of the poster, gender of the poster, and post title. We obtained the age and gender data using the Google Sheets formula converter and included these columns to show the demographics of users who use the word “situationship.” Some entries in the age and gender columns were blank or not applicable, so we filtered them out to keep our picture of the age and gender demographics accurate. We chose Google Sheets because its user-friendly tools suited our dataset's characteristics and our analysis.
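The filtering step described above was done in Google Sheets. For readers who prefer a script, the same idea can be sketched in Python on a CSV export. The column names, the set of values treated as missing, and the sample rows below are all assumptions made for illustration, not the project's real data.

```python
import csv
import io

# Values treated as "blank or not applicable" (an assumption for this sketch).
MISSING = {"", "n/a", "na", "not applicable"}


def drop_incomplete_rows(rows, required=("age", "gender")):
    """Keep only rows where every required column holds a usable value."""
    return [
        row
        for row in rows
        if all((row.get(col) or "").strip().lower() not in MISSING for col in required)
    ]


# Made-up rows standing in for a CSV export; not real project data.
sample = io.StringIO(
    "subreddit,year,age,gender,title\n"
    "r/example_one,2022,24,F,Example title one\n"
    "r/example_two,2023,,M,Example title two\n"
    "r/example_three,2023,31,N/A,Example title three\n"
)
rows = list(csv.DictReader(sample))
kept = drop_incomplete_rows(rows)
print(f"kept {len(kept)} of {len(rows)} rows")
```

Dropping incomplete rows keeps the demographic breakdown limited to posts where both age and gender are known, at the cost of a smaller sample.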

Team Biographies

Chloe Chang
chloechang@berkeley.edu

Reddit Data Collection & Visualization, TikTok & Reddit Data Cleaning, Data Critique, Website Design & Implementation, Image Sourcing, Team Organization & Timeline, Final Review Editor

My name is Chloe, and I am a third-year student studying Data Science with a domain emphasis in Business and Industrial Analytics. I am also pursuing the Berkeley Certificate in Design Innovation. I am passionate about using design to connect people with the tools and opportunities that improve and enrich their daily lives. In DIGHUM 100, I have learned about many critical theories that I plan to apply to my design work, analyzing data and its contexts to create products based on human understanding.

Tera Chant
terachant@berkeley.edu

Project Narrative (Influence of Social Media on Language, "Situationship" and Social Media, Analytical Models), Website Implementation

Hi, my name is Tera and I’m a rising junior majoring in Data Science and Economics. I’m interested in utilizing data-driven analysis to uncover valuable insights towards developing innovative solutions. Through this class, I’ve come to recognize the various aspects of digital humanities that impact both the analysis and reach of data. Specifically, through this project, I found social media to play an integral role in shaping my language choices and thus wanted to explore the relationship between social media and the spread of slang.

Jai Gupta
jai.gupta@berkeley.edu

Project Narrative (Influence of Social Media on Language, Theoretical Lenses), Acknowledgements

Hi, my name is Jai Gupta. I am a rising junior, majoring in Economics and Data Science. I have a deep interest in the field of Real Estate and Proptech. Taking this class has taught me how to navigate the field of AI, understand its principles, and apply it to the Real Estate field. I am taking this class while working on my Proptech startup. Applying the different theories from class to the real world has given me amazing insights and results. Using Marxist theory to understand what the masses feel about certain price points helped boost sales. I want to continue to explore the Data Science field and apply it to other non-technical fields. Feel free to reach out if you have any questions!

Amber He Wei
he.wei.25@berkeley.edu

TikTok Data Collection & Visualization, TikTok Data Cleaning, Data Critique, Analytical Models

Hi, my name is Amber He Wei. I am a rising senior, majoring in Applied Mathematics and Data Science. I am interested in the field of optimization. As I continue to explore Data Science, I want to delve deeper into the intersection of machine learning and mathematical algorithms. Taking this class has taught me how to use data more ethically and effectively. Through the final project, I gained a better understanding of the relationship between social media and the evolution of slang usage.

Nathan Huynh
nathan1nathan@berkeley.edu

Project Narrative (Introduction to Social Media, "Situationship" and Social Media, Methodology, Analytical Models), About Page

Hi, my name is Nathan and I’m a rising junior majoring in Computer Science and minoring in Data Science. My main interests are computer science education and full-stack development. Taking this class made me realize the importance of connecting humanities with technological improvements. Feel free to reach out if you have any questions!

Jessica Ma
yma55@berkeley.edu

TikTok Data Collection & Visualization, TikTok Data Cleaning, Data Critique, Analytical Models

Hi, my name is Jessica, and I’m a junior double majoring in Economics and Data Science, with a keen interest in AI training and its underlying principles. By taking this course, I have come to recognize the profound connection between technology and humanities. As I continue to explore Data Science, my vision is to push the boundaries of what is possible and contribute to a more inclusive and equitable future for all.

Yash Mantri
yash.mantri@berkeley.edu

Project Narrative (Historical Context, Methodology)

Hi! My name is Yash and I’m a rising junior majoring in Economics and Data Science, passionate about leveraging data to tackle real-world business challenges. Taking DIGHUM 100 this summer has enabled me to discover how data can be used to uncover patterns in literature, analyze historical trends, and even map out cultural phenomena, broadening my overall understanding.

Acknowledgements

We would like to thank the following people and organizations for their support and assistance throughout this project:

Dr. Scott Caddy, for introducing most of us to the wonderful world of digital humanities.
Rosa M. Norton, for providing constructive feedback and helping us refine our project.
The Reddit and TikTok communities, especially the users of relevant subreddits and TikTok accounts related to the term “situationship,” whose content was central to our research.
The creators of ParseHub and BrowserFlow, for assisting with data scraping by developing the tools we used.
The contributors to Vecteezy Stock Images and Wix Stock Images, for providing the multimodal elements that help illuminate our overall narrative and make our website more enjoyable to use.

