This is the alignment-related excerpt from the full reflection text (included below).

——
I had a hard time deciding what to do for this project, but I knew I wanted to explore something that was interesting to me and to take advantage of the many tools that are out there. I think that there was a large amount of alignment between what I set out to explore and the tools and data I used. I was thinking about data sources that would be abundant and relevant to my interests, so I settled on song lyrics. I saw that I could incorporate Google search, Beautiful Soup, and linguistic analysis. I formed my project based on the tools that I found interesting and wanted to learn more about. I wanted to incorporate as many as I could while still generating a meaningful result. I had the assumption that artists’ songs would fall farther to one end or another of the polarity scale, and that by plotting the positive scores versus the negative scores, I could try to find a correlation if any existed.

I did come across a few limitations of the tools that I chose, which I hadn’t expected when I first set out. It was difficult to ensure that I could identify the correct URL for the chosen artist because the order of top hits is unreliable. I also had to work around LyricsFreak’s collection of songs. It was challenging if the artist wasn’t well known enough because their search results were sparse and there wasn’t as much data to work with. Also, as I mentioned before, I wish that I could have gotten the figure I created to reveal the song title when the viewer hovered over each point. This would have been useful for further inspecting specific points that were particularly interesting, like outliers or ones along either axis.

I feel as though my text mining, stripping, and sentiment analysis worked very well when the artist was famous and had a significant body of work. Parsing through the HTML was pretty standard for most of the songs, which leads me to believe that the process ran smoothly on artists like Adele, Taylor Swift, Elton John, and Elvis Presley. On lesser-known artists, it was not as reliable because it wasn’t guaranteed that the right URL would be found in the Google search and the song selection was not as abundant. I can’t say that one could conclude much from the results due to the way I designed the visualization, but I would say the analysis I performed aligned well with the tools I set out to explore and learn about.

— end of alignment excerpt


Student’s full reflection:

For this project, I created a program that asks the user for a music artist, scrapes the lyrics of their songs, and performs a sentiment analysis on them. In order to show results for a specific artist of the user’s choice, I used Google and lyricsfreak.com as data sources. Once I parsed through the HTML and gathered the lyrics of each song, I used a sentiment analyzer to gauge the overall mood of their songs. I hoped to get more comfortable using Python to interact with information on the web and to apply an appropriate computational analysis to it. Additionally, I wanted to create a visual representation of artists’ lyrics to compare and contrast in hopes of finding trends.

Once the user enters an artist’s name, my program performs a Google search with those terms and finds the artist’s song list on the LyricsFreak website. I looked at different lyric websites, but I chose this one because the body of the lyrics was easy to identify and easily accessible using Beautiful Soup. From the artist’s page, I extracted the URL of each song and went to each individual page to scrape the lyrics. Keeping each song’s lyrics separate, I used the NLTK sentiment intensity analyzer to find the polarity scores of each song. Finally, using matplotlib, I plotted each song by the selected artist on a scatter plot using the negative and positive scores from the analysis.
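
The per-song scoring step described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `make_vader_scorer` and `score_songs` are hypothetical, and the sketch assumes the lyrics have already been scraped into a dictionary keyed by song title. The scorer is passed in as an argument so the same function works with NLTK's analyzer or with any stand-in that returns the same kind of dictionary.

```python
from typing import Callable, Dict, List, Tuple

# A scorer maps text to a dict with at least "neg" and "pos" keys, the same
# shape that NLTK's SentimentIntensityAnalyzer.polarity_scores returns.
Scorer = Callable[[str], Dict[str, float]]


def make_vader_scorer() -> Scorer:
    """Build a scorer backed by NLTK's VADER analyzer.

    Assumes nltk is installed and the lexicon was fetched once with
    nltk.download("vader_lexicon").
    """
    # Imported lazily so the rest of the sketch runs without nltk.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    return SentimentIntensityAnalyzer().polarity_scores


def score_songs(
    lyrics_by_title: Dict[str, str], scorer: Scorer
) -> List[Tuple[str, float, float]]:
    """Return (title, negative score, positive score) for each song.

    Each song is scored on its own, so the lyrics are never merged.
    """
    points = []
    for title, lyrics in lyrics_by_title.items():
        scores = scorer(lyrics)
        points.append((title, scores["neg"], scores["pos"]))
    return points
```

With NLTK set up, `score_songs(lyrics_by_title, make_vader_scorer())` would produce one point per song, ready to be plotted.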

Since many artists have a lot of songs, the graphs got rather crowded when I labeled each point, which made them difficult to read. I tried for a long time to figure out how to get the labels to appear when the points were hovered over, but I was unsuccessful and could not find a way to interact with the figures I generated. Instead, I randomly labeled a few of the songs so that the viewer could identify specific songs to make sense of the plot but still be able to observe any trends.

The final representation of my information is a scatter plot of the positive and negative sentiment values of every song by a particular artist. One thing that surprised me was that the songs turned out to be a lot more neutral than I had expected. I tend to think songs written by the same artist have similar tones, but judging by the analysis I performed, I don’t think I can fully support that hypothesis. However, there were slight trends. For example, Adele’s songs were more negative than positive for the most part, but generally the points were pretty scattered. Also, I thought it was interesting to look at the songs that were very close to either axis. This suggests that they were almost entirely negative or positive and more skewed to one side or the other.
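
The scatter plot with a few randomly labeled songs could look something like the sketch below. It is only an illustration under stated assumptions: the names `choose_labeled` and `plot_songs` are hypothetical, each point is assumed to be a `(title, negative, positive)` tuple, and putting the negative score on the x-axis and the positive score on the y-axis is a guess at the layout rather than something the reflection states.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[str, float, float]  # (title, negative score, positive score)


def choose_labeled(points: Sequence[Point], k: int, seed: int = 0) -> List[Point]:
    """Pick up to k songs to label; a fixed seed keeps the choice repeatable."""
    rng = random.Random(seed)
    return rng.sample(list(points), min(k, len(points)))


def plot_songs(points: Sequence[Point], artist: str, k: int = 5):
    """Scatter every song, annotating only a random handful of titles."""
    # Imported lazily; assumes matplotlib is installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter([p[1] for p in points], [p[2] for p in points])
    for title, neg, pos in choose_labeled(points, k):
        # Offset the text slightly so it does not sit on top of the marker.
        ax.annotate(title, (neg, pos), textcoords="offset points", xytext=(4, 4))
    ax.set_xlabel("Negative score")
    ax.set_ylabel("Positive score")
    ax.set_title(f"Lyric sentiment: {artist}")
    return fig
```

Labeling only a sample keeps the plot readable for artists with many songs, at the cost that an interesting outlier may go unlabeled.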

I had a hard time deciding what to do for this project, but I knew I wanted to explore something that was interesting to me and to take advantage of the many tools that are out there. I think that there was a large amount of alignment between what I set out to explore and the tools and data I used. I was thinking about data sources that would be abundant and relevant to my interests, so I settled on song lyrics. I saw that I could incorporate Google search, Beautiful Soup, and linguistic analysis. I formed my project based on the tools that I found interesting and wanted to learn more about. I wanted to incorporate as many as I could while still generating a meaningful result. I had the assumption that artists’ songs would fall farther to one end or another of the polarity scale, and that by plotting the positive scores versus the negative scores, I could try to find a correlation if any existed.

I did come across a few limitations of the tools that I chose, which I hadn’t expected when I first set out. It was difficult to ensure that I could identify the correct URL for the chosen artist because the order of top hits is unreliable. I also had to work around LyricsFreak’s collection of songs. It was challenging if the artist wasn’t well known enough because their search results were sparse and there wasn’t as much data to work with. Also, as I mentioned before, I wish that I could have gotten the figure I created to reveal the song title when the viewer hovered over each point. This would have been useful for further inspecting specific points that were particularly interesting, like outliers or ones along either axis.
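
The hover behavior wished for above might be approached with matplotlib's event handling, as in this rough sketch. It is one possible approach, not something the project implemented. The helper names `nearest_title` and `add_hover_labels` are hypothetical, points are assumed to be `(title, x, y)` tuples in the same coordinates as the plot, and the `radius` of 0.02 is an arbitrary choice for scores that range from 0 to 1. It only has an effect in an interactive figure window, not in a saved image.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[str, float, float]  # (title, x, y) in data coordinates


def nearest_title(
    points: Sequence[Point], x: float, y: float, radius: float = 0.02
) -> Optional[str]:
    """Return the title of the closest point within radius of (x, y), if any."""
    best_title, best_dist = None, radius
    for title, px, py in points:
        dist = math.hypot(px - x, py - y)
        if dist <= best_dist:
            best_title, best_dist = title, dist
    return best_title


def add_hover_labels(fig, ax, points: Sequence[Point], radius: float = 0.02):
    """Show a song title next to the cursor while it is near a point."""
    annot = ax.annotate("", xy=(0, 0), xytext=(8, 8), textcoords="offset points")
    annot.set_visible(False)

    def on_move(event):
        title = None
        if event.inaxes is ax:
            # xdata and ydata are the cursor position in data coordinates.
            title = nearest_title(points, event.xdata, event.ydata, radius)
        if title is None:
            annot.set_visible(False)
        else:
            annot.xy = (event.xdata, event.ydata)
            annot.set_text(title)
            annot.set_visible(True)
        fig.canvas.draw_idle()

    # Returns the connection id, which can later be passed to mpl_disconnect.
    return fig.canvas.mpl_connect("motion_notify_event", on_move)
```

Calling `add_hover_labels(fig, ax, points)` before `plt.show()` would attach the handler. Keeping the hit test in a plain function makes it easy to check without opening a window.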

I feel as though my text mining, stripping, and sentiment analysis worked very well when the artist was famous and had a significant body of work. Parsing through the HTML was pretty standard for most of the songs, which leads me to believe that the process ran smoothly on artists like Adele, Taylor Swift, Elton John, and Elvis Presley. On lesser-known artists, it was not as reliable because it wasn’t guaranteed that the right URL would be found in the Google search and the song selection was not as abundant. I can’t say that one could conclude much from the results due to the way I designed the visualization, but I would say the analysis I performed aligned well with the tools I set out to explore and learn about.
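
One way to depend less on the order of search hits would be to filter the result URLs instead of trusting the top one. The sketch below is a hedged illustration only: `slugify` and `pick_artist_url` are hypothetical helpers, and the example URLs in the usage note are made up to show the idea, not taken from the real site's layout.

```python
import re
from typing import Iterable, Optional
from urllib.parse import urlparse


def slugify(text: str) -> str:
    """Lowercase the text and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def pick_artist_url(
    urls: Iterable[str], artist: str, domain: str = "lyricsfreak.com"
) -> Optional[str]:
    """Return the first URL on the lyrics site whose path mentions the artist.

    Returns None when no result qualifies, so the caller can report that
    the artist was not found instead of scraping the wrong page.
    """
    wanted = slugify(artist)
    for url in urls:
        parsed = urlparse(url)
        host = parsed.netloc.lower()
        on_site = host == domain or host.endswith("." + domain)
        if on_site and wanted in slugify(parsed.path):
            return url
    return None
```

For example, given the made-up results `["https://en.wikipedia.org/wiki/Elton_John", "https://www.lyricsfreak.com/e/elton+john/"]`, calling `pick_artist_url(results, "Elton John")` skips the first hit and returns the second. The check is deliberately loose: it does not tell an artist page from a song page, and very short artist names could match unrelated paths.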

I am very pleased with the skill set I learned from this project. This was one of the first projects that allowed for some freedom in what topics and concepts we wanted to dive deeper into. I was definitely challenged by this project. One of the biggest challenges I can pinpoint was visualizing the flow of the program and going from what was in my head to lines of code. I had to make decisions about which functions would handle individual song lyrics and which ones would perform an operation on a list of scraped information. It helped me to write out the different functions I thought I needed in plain English, along with their parameters and return objects, and to create a flow diagram. I felt as though this project was well scoped and that I gained a significant amount of independence during it. Although I might not use the specific results from this project, I now have a handful of new tools that I look forward to incorporating in my future projects.