Final Group Project Reflection

When our final project was first introduced, I didn’t think there was any way we would finish even a small portion of it. There just didn’t seem to be enough time to put an entire project together using DH tools, let alone analyze it, which would’ve been hard had we not finished. I also had no idea what kind of “thing” we would come up with to analyze. Nonetheless, this class proved me wrong.

If I must be honest, I didn’t understand Digital Humanities at all until working on this project. Yes, we read a lot of articles about it and saw a few projects, but I never really understood the point. It seemed like a lot of work to put something together that maybe a few people would look at. The idea of DH is very interesting to me now, especially after contributing to our own project.

Working on this project taught me things I never really thought about until writing this reflection. I learned how to collaborate with not just my small group but a large group (the class) as a whole. Quite obviously, as this class states, I learned about different DH tools, data collection, research, analysis, etc., but there were also aspects I took away that weren’t exactly limited to “DH” terms. After completing this project, I realized how crucial it is to collaborate with others and have a place where your group is able to locate and access everything you’re using. I think that’s an aspect my group hit right on the head. After we began our project and started saving the word clouds Voyant produced for us, we realized how many files we were going to need in order to complete our project. We immediately created a Google Drive folder where we shared our files as we finished them so they were easy to locate in the future.

Looking back, I would’ve put more time into the data collection portion of the project. After working with different tools and having to change our format several times, I think if we had spent an extra hour focusing on the size and accuracy of our corpus, we would have saved time on reorganization in the long run. Despite having to change our format in order to use some tools, we finished our project. Through this, I learned that your corpus is crucial in a DH project and it’s important to know your tools well prior to using them (emphasis on the well). Had we spent more time on our corpus at the start, I think we could’ve created better visual aids to display our results and performed a broader analysis of our research question. Another area I would alter or look into in the future would be our use of Voyant. After looking over our corpus at the end of our project, I realized the mission statements were all different lengths. For example, Voyant would pull a word cloud of about 20-30 words from a short mission statement, and that could be the entire statement, so the cloud for a short statement isn’t really comparable to the cloud for a long one. I think this could’ve been prevented had we spent more time collecting and surveying our data, or if we had used another tool. That being said, I think it was crucial for us to provide visuals using just the top keywords (words used the most) to keep it somewhat accurate across all of the mission statements, no matter the length.
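One way to think about the length problem is to compare each word’s share of a statement instead of its raw count. The snippet below is only a rough sketch of that idea in Python; it is not what Voyant does internally, and the two sample texts are invented placeholders, not real mission statements.

```python
import re
from collections import Counter

def relative_frequencies(text):
    """Return each word's share of the total word count (0.0 to 1.0)."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}

# Invented placeholder texts of very different lengths.
short_statement = "serve students"
long_statement = "students students learn grow lead serve teach build share think"

short_freqs = relative_frequencies(short_statement)
long_freqs = relative_frequencies(long_statement)

# "students" appears more often in the long text, but it takes up
# a smaller share of that text than it does of the short one.
print(short_freqs["students"], long_freqs["students"])
```

Looking at shares like this would at least show when a “top” word in a short statement is there only because the statement has so few words to begin with.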

I loved experiencing what it’s like to work on a DH project. We learned a lot of things along the way. In reflection, I think it would’ve been helpful to learn more and ask more questions about the collaboration process of a DH project. It seems as though working with a large number of people can become difficult at times, as we experienced at the start. Even though our class isn’t very big, I think we made the right decision by breaking up into smaller groups. At the beginning of this project we were all having such a hard time agreeing on things and getting things done that it only made sense to break it up according to our interests. I felt like we almost wasted a bunch of time. Because we broke up into smaller groups, we never really got to experience what it was like to collaborate with a larger group. Yes, we put all of our projects onto a website, but I didn’t know what kinds of things the other groups were doing until I read their blog posts or until their projects were up on the site. I don’t know if this is what a lot of DH projects are like, or if their teams work together more closely to complete them. I definitely think it would’ve been helpful to ask MATRIX and the DH library more about collaborating and completing these projects prior to working on our own. However, we didn’t know these things would come up while working on our own, and it was a good learning experience to overcome the obstacles we faced.

For future projects, I now know to spend a lot of time on the data collection portion of a project in an attempt to create a more organized corpus. I know that collaboration is an important aspect of every group project, but especially of DH projects, and I know that it’s important to have a common area among group members in order to save your work and resources so that they are accessible to every member.

Even though I would change some things about our project in order to make it easier and more accurate, I believe it served its purpose in making our research question come to life through visuals. I learned more than expected by working on this project and I’m happy with what we were able to complete. We did more than I originally thought we would, and learned a lot of lessons through the process. This project taught me the concepts and purposes of digital humanities more than anything else we did in the class. I hope to incorporate my learnings and create my own digital humanities projects in the future. Overall, it was a great learning experience.


Group Reflection (Week 3)

This was a short week and that didn’t seem to be a problem for my group at all. We all worked accordingly with the time we had and finished everything up. Our next step is to work on our analysis. Today we uploaded our data onto Palladio and organized the graphic to fit our research question. At first there was no organization whatsoever to the graphic Palladio produced, so we had to make some changes so it was easier to read. We put the colleges north of the river on the North end and the colleges south of the river on the South end to mimic the Red Cedar river, with the remaining connecting words (words that are used in more than one college’s mission statement) floating in the middle. You can see the graphics below:

Keywords in the mission statements (provided by Voyant) using Palladio.

Top keywords in mission statements (provided by Voyant) using Palladio.

Originally, we had only planned to create the first graphic, which uses all of the keywords, but after seeing how cluttered and unreadable it was we decided to run it through with only the top keywords (the words used the most), which cleaned it up a lot. After creating our graphics in Palladio we decided to change the structure of our csv file and see what we could do with RAW again. RAW actually ended up working this time because of the way we organized our file and inputted the data. Before, when we were working with RAW, we were not organizing our data correctly, so it wasn’t bringing up any of the connecting words. You can see the RAW graphic below:
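To give a rough idea of the kind of restructuring that helped, here is a small Python sketch that flattens a list of keywords per college into one row per college-keyword pair and then finds the “connecting words”. This is an assumption about a layout that network-style tools can generally read, not the exact file we gave Palladio or RAW, and the college names and keywords are made-up placeholders.

```python
import csv
import io
from collections import defaultdict

# Hypothetical data: placeholder colleges and invented keywords.
keywords_by_college = {
    "College A": ["research", "world", "students"],
    "College B": ["arts", "world", "community"],
    "College C": ["research", "health"],
}

def to_edge_rows(mapping):
    """Flatten {college: [keywords]} into (college, keyword) pairs."""
    return [(college, word) for college, words in mapping.items() for word in words]

def shared_keywords(rows):
    """Return keywords that appear under more than one college."""
    colleges_by_word = defaultdict(set)
    for college, word in rows:
        colleges_by_word[word].add(college)
    return {w: sorted(c) for w, c in colleges_by_word.items() if len(c) > 1}

rows = to_edge_rows(keywords_by_college)

# Write the pairs as CSV text (to a string here; a file works the same way).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["college", "keyword"])
writer.writerows(rows)
csv_text = buffer.getvalue()

print(csv_text)
print(shared_keywords(rows))
```

With one pair per row, a word used by two colleges shows up in two rows, which is what lets a tool draw it as a single point connected to both.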

Top words in mission statements (provided by Voyant) using RAW.

Through these graphics you’re able to break down the topics of each college while matching them with the side of the river on which the college is located. I think it’s important to use different tools so your audience can look at the data in different ways, because they might not understand some graphics as well as others, and they all tie in together. It’s better to have more graphics to back up your research question. Each of our graphics conveys the same point in a different way, which is great for when we start our analysis next week. I think that creating all of our visuals before Thanksgiving break was very helpful because we’re able to spend next week analyzing them.

Before working on this project, I started questioning whether or not I really liked Digital Humanities and I couldn’t figure out what the big point was. Now, I can confidently say that I love DH. I enjoyed working with the many tools and I even liked running into problems and trying to figure out another solution. It’s amazing to see how many tools are available to use and what they’re capable of. In our project alone we used 4 (Voyant, RAW, Palladio and Google Fusion Tables). Digital Humanities allows you to study data in different ways which is what makes these projects so interesting. I also think collaboration is an important part in these projects because there are so many different aspects to it no matter what kind of question you have or what kind of data you have to work with.

The graphics and map we created allow us to visualize our research question perfectly. While there are some common words across colleges, like “world”, that don’t exactly work as a relevant keyword for a specific college, it’s still interesting to see what some of their other words are alongside their location. I think we’ll start looking at the exact words we’re working with in our graphics and add that factor into our analysis. As of right now, we haven’t had a chance to look that closely.

Palladio/Group Reflection (Week 2)

In the first week of group work, my group did the majority of our research and data collection and planned the structure of our project. This week we started putting it all together.

We spent a lot of time on RAW trying to figure out exactly how to accomplish what we were trying to do, but we fell short of the data needed to use RAW. Our goal was to find a way to create a chart that connects the common words used in each mission statement, and while RAW could document each word in the mission statements, it wasn’t connecting the words that were common.

Although this has nothing to do with our current research question (Are the colleges located North of the river more arts based while South of the river is more STEM based–science, technology, engineering and mathematics?), it’s another aspect we can add to our project in order to compare the mission statements. Unfortunately I couldn’t make it to class on Thursday, but my group informed me of the tool Palladio, which truly turned the tables.

After spending SO MUCH TIME with RAW and ending up frustrated because it wasn’t working, an answer was given to us in class on Thursday. I haven’t had much time to work with Palladio to see all that it has to offer, but I’m confident it will be able to produce the exact graph we’re looking to create this coming week.

So far this project has been a positive learning experience, and I think my group has a lot to do with that. We all came to agreements and stuck to our tasks. We’ve been finishing everything surprisingly quickly and have been able to solve our problems pretty quickly as well. I’m anxious to see our final product once we’re done working with Palladio and our map. I’m also anxious to see what the other groups are creating and to see our big project as a whole. As a student, I find this project interesting, and by splitting up into groups we’re able to analyze the mission statements of the Big Ten schools, the colleges within MSU, and more using the same corpus.

Group Project Reflection (Week 1)

Before this week I was starting to get nervous about the outcome of our project. It seemed like we didn’t have enough data to work with and only a limited number of research questions. When we split into groups according to everyone’s particular angle of interest within our mission statement focus, it started to create a better picture.

My group chose to focus on the mission statements within the different MSU colleges and map them out in Tableau with their topics. Research question: Are the colleges located North of the river more arts based while South of the river is more STEM based (science, technology, engineering and mathematics)? The tools we chose to use were the Topic Modeling Tool and Tableau Public. We started by creating a Google Doc we shared with each other, where we recorded our tasks and the steps we took, and added all of the colleges, their mission statements, and their locations by address.

I was surprised at how quickly our group started putting everything together. So far we’ve completed every single one of the tasks we set for ourselves, which is the kind of thing that makes or breaks a group project.

We ran into a few problems last week when working with our data and solved them pretty quickly. Since we chose to topic model we began to run the first mission statement through the tool and realized it was too short (one sentence) for the tool to work correctly. We had to find another way to find key words/topics, so we resorted to Voyant.

Although Voyant doesn’t provide us with exact “topics”, we are still able to find the key words within each mission statement and use those as topics. Of course, the bigger the words in the Voyant graphic, the more times they’re used in the statement. Here’s an example of the Eli Broad College of Business mission statement after we ran it through Voyant (keeping in mind that we used the stop words list and added a few we agreed on like college, university, michigan, state, and a few others we found within each college).
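For anyone curious what “key words with a stop words list” boils down to, here is a minimal Python sketch of the idea. It is an approximation, not Voyant’s actual method: the stop word set is a tiny stand-in I made up (plus the custom words mentioned above), and the sample text is an invented placeholder, not the real Broad mission statement.

```python
import re
from collections import Counter

# Hypothetical stop words: a tiny stand-in for a real stop list,
# plus the custom additions mentioned above.
STOP_WORDS = {"the", "and", "of", "to", "a", "in", "we", "our", "for",
              "college", "university", "michigan", "state"}

def top_keywords(text, n=5):
    """Return the n most frequent non-stop words with their counts."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)

# Invented placeholder text, not a real mission statement.
sample = ("We prepare business leaders. Our business programs serve "
          "students and the business community of Michigan State.")

print(top_keywords(sample, 3))
```

The words with the highest counts are the ones a word cloud would draw the largest, and anything in the stop list never shows up at all.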


Trying to get all the words in our word cloud to fit inside the window to screenshot was a problem Laura and I kept running into. We ended up having to refresh the page several times so the words would rearrange until they all fit. For a few of the colleges, if refreshing the page wasn’t adjusting the words the way we wanted, we even had to add a few words to the stop words list that we felt were unimportant or less important than the others.

Collecting these images led us to create a Google Drive folder. Since we have so many different aspects of our project, we realized we needed to keep everything organized and located in the same place. This makes it easy for all of us to access, and easier to refer back to, if needed.

We’re planning on saving each of these word cloud graphics and adding them to our map on Tableau Public, which was another interesting tool to experiment with using our data. While Laura and I worked on running the mission statements through Voyant and saving the graphics, Katie worked on the data for Tableau. Dividing up the work in this way allowed us to quickly finish our tasks in order to make more progress on our project. While Katie was modifying our data to work in Tableau she ran into a few problems. We originally thought that by looking up the addresses of each college and finding the latitude and longitude points we’d be able to map them in Tableau. When she couldn’t get Tableau to accept our lat/long points the way we had entered them, she decided to use complete addresses and map it that way.

I assume that things like this come up a lot in DH projects if you don’t do enough research on the tools you’re using. When we practiced using these tools in class we were always given data, and now we’re creating it and learning more about the format needed for these tools. Sometimes we come to a dead end and have to find a different route/different way to do what we’re trying to do, and that’s DH for you! It’s almost like a maze; there’s only one end and a handful of different ways to get there. If you run into a wall you take another direction and find your way out.

An important part of a DH project is recognizing any limitations and surveying the way your tools work with your data and the data itself. As a class we only obtained the mission statements of the Big Ten universities, but my group was more interested in working with a corpus within MSU so we had to go find our own data.

Obviously, we can see whether or not the schools North of the river are more artsy and the schools South of it are more STEM just from looking at a map, but we’re interested in seeing if the mission statements convey the same idea (What are the college’s intentions? What do they focus on? What do they value? etc.). Focusing on one research question we were all interested in is allowing us to put together a project efficiently and easily.

After looking at our project put together as a whole and comparing the key words/topics of each college’s mission statement, we’re going to look at MSU’s mission statement and see how the topics we pull from each college compare. Hopefully this will provide a better connection with the other groups’ findings as well when we move into collaboration.


At first I went and filtered all of the libraries in the dataset to see what type of wifi they offer. Later I found out that all of them were in fact “free”, so that question didn’t really get me anywhere.

How does the wifi in lower-income communities compare/contrast to higher income communities?

Hypothesis: lower-income communities have more free than free-based wifi.

Since our data is structured around zip codes, I added the zip to the marks and chose to layer the map by zip and household income. I also added place names to this layer. I colored the points by type of wifi and noticed a lot more blue dots than orange dots (free-based: blue, free: orange). After looking at the breakdown of household income I noticed there are two areas with the lowest income: the Bronx and Brooklyn. Within these lower income boroughs I thought that there might be more free than free-based wifi locations.

Tableau Map of wifi type vs. income


I decided to add the names of the locations to the map in the marks tab to take a look at the few locations that were actually free (not just within these lower income areas). It was annoying that I had to click on each dot in order to see what it was, though. I found that the libraries, a few cafes/lounges, a bar, a FedEx Kinko’s, and a condominium complex were among the free locations on the map.

After thinking about these few free locations, I realized that it does make sense no matter what the income is in these areas. Again, Starbucks and other higher-end businesses are typically in the higher income areas, but I thought it might be different. However, it does make sense that the majority of the businesses are free-based, because it wouldn’t be right to just go in and use a place’s wifi without actually buying something there or being there for a reason other than their wifi, right? I noticed that the higher income areas of the map are heavily populated with free-based wifi as well. Again, there are a lot of free-based places in the lower income areas, but the higher income areas like Manhattan are more populated with those types of businesses.

I think it would be interesting if we were able to see the populations of these areas or the frequency of the use of wifi at certain types of businesses (both free and free-based). I did Google the population of these places: as of 2013, the Bronx has 1.4 million people, Brooklyn: 2.5 million, Manhattan: 1.6 million, and New York City: 8.4 million. Of these areas, the Bronx and Brooklyn are mainly lower income, and NYC as a whole and Manhattan are higher income. There is a large cluster of tightly packed dots within the higher income and more heavily populated areas, which makes sense to me. I chose to look at this information by putting the number of records on the size option in the marks box. You’re able to see how many locations are in each area by the size of the dot. If you click on each dot it gives you the number of records.

Tableau Map of Records


Furthermore, I realized the type of wifi that is offered has nothing to do with where you live, and has everything to do with what type of businesses you live by. It’s all about the location of these places. Obviously if you live out in the Bronx near the NYC Public Library locations you’re going to use their free wifi. On the other hand, if you live in the busy streets of Manhattan you’re going to be surrounded by a ton of businesses that offer free-based wifi, so just be prepared to get out your wallet and their so-called “free-based” wifi will be “free” to use.

Overall this mapping tool is very interesting. A lot can come out of it if you have the right data. The most frustrating thing was not being able to filter or mark the data as I wanted. For example, I kept wanting to color code the groups of libraries, Starbucks, McDonald’s, etc., but it couldn’t differentiate between them. I’m looking forward to seeing if there is a way we could use it for our final project.

Problems: I had a lot of difficulty saving my second map on Tableau to the web. I kept getting an error that said “invalid database name value”. After two days and a million attempts to save, I finally just reopened the Excel workbook of the data and renamed it as something else. I then went back to Tableau, opened the newly named data, recreated my map, and was able to save it. My screenshots were also coming up with errors every time I tried to add them to this post. I kept getting an error that said “try again later”. After hours of frustration I decided to message myself my screenshots, save them on my phone, log onto WordPress, and upload them that way instead of on my laptop.

TMT: Grange Visitor Analysis

How do the main topics from the 1876 Grange Visitor publications and the 1896 Grange Visitor publications compare/contrast?

1876 top 3 topics:

1. Grange State Order Business Secretary Committee Association Work Officers Plaster

2. Visitor Good Time Office Interest Money Patrons Price Orders County

3. Grange Granges Members Master Meeting St Prices Sec Subordinate List

1896 top 3 topics:

1. State Work Michigan Make School Made Man Large Years Great

2. Time Farm Life Day College Men People Place House World

3. Grange Good Mrs County Farmers Present Free Kathleen General Price

While I know that the Grange Visitor spans from 1875-1896 I thought it would be more interesting to look at it based on the 20 year span within the publications. Looking at the top 3 topics the TMT provided for the year 1876, it seems to be more about government. In contrast, the publications from 1896 seemed to be more about education, farming and labor.

I thought these main topics were particularly interesting, especially because 1876 and 1896 were both election years. It makes sense that the 1876 topics were more government based, which is evident with the words: state, order, office, patrons, meeting, subordinate, etc. Most, if not all, of the topic words can be related to government in some way. The 1896 topics seemed a little more focused on other aspects, not just the government. I’m assuming this could be because in 1876 the Grange was only publishing a newspaper once a month, and by 1896 they had long since moved to publishing twice a month. It could also be because there were different pressing issues in the two different years, which I would have to do further research on to find out.

I was surprised to see both the topics of school and college in 1896 and not in 1876. This could be attributed to the possibility that people started viewing education as an important aspect in life in 1896. I was also surprised to see that farm and labor were in 1896 and not in 1876. That seems like something that would typically come earlier. Considering that these were both election years, I expected both of them to be more government based like the 1876 topics.

Although I don’t know much about these two years as a whole, it’s interesting to see how things can change in 20 years. I think it would be beneficial to look up certain things that happened during these 20 years and see if societal changes affected their publications.

In order to look at this further I decided to add in the year 1886 to draw some more conclusions. Here are the 3 main topics from the year 1886:

1. Grange Good Make Mich Year Order Ft Patrons Water Stock

2. Farmers Man County Men Time People Farmer Work Mrs Ing

3. Day Made State Good Michigan Children Great Labor Give Place

Given that this was not an election year, these topics definitely make more sense. I’m now able to see the shift from the topic of government in 1876 to the focus on farmers and labor in 1886. However, that still doesn’t explain the fact that the 1896 topics (an election year) were more similar to the 1886 topics (a non-election year) than to the topics of 1876 (also an election year).

After comparing all 3 years, 1886 and 1896 seem to be very similar. I’m wondering if that has anything to do with the fact that they both published twice a month instead of once a month like 1876, or whether all 3 years would look a little more alike if I had done more topics.

I only chose to look at the top 3 topics in order to make it easier to make comparisons. I’m assuming if I had done 5-10 topics I might get more similarities between these 3 years.
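To put a rough number on “similar”, one option is to count how many topic words each pair of years has in common. The Python sketch below does that using the topic words exactly as listed in this post. Counting shared words is just my own quick check, not something the TMT reports, and it treats every word as equally important.

```python
# Topic words copied from the TMT output listed above, one string per topic.
TOPICS = {
    1876: [
        "Grange State Order Business Secretary Committee Association Work Officers Plaster",
        "Visitor Good Time Office Interest Money Patrons Price Orders County",
        "Grange Granges Members Master Meeting St Prices Sec Subordinate List",
    ],
    1886: [
        "Grange Good Make Mich Year Order Ft Patrons Water Stock",
        "Farmers Man County Men Time People Farmer Work Mrs Ing",
        "Day Made State Good Michigan Children Great Labor Give Place",
    ],
    1896: [
        "State Work Michigan Make School Made Man Large Years Great",
        "Time Farm Life Day College Men People Place House World",
        "Grange Good Mrs County Farmers Present Free Kathleen General Price",
    ],
}

def words_for(year):
    """Collect the lowercase words from all of a year's topics into one set."""
    return {w.lower() for topic in TOPICS[year] for w in topic.split()}

def shared(year_a, year_b):
    """Words that appear in the topics of both years."""
    return words_for(year_a) & words_for(year_b)

for a, b in [(1876, 1886), (1886, 1896), (1876, 1896)]:
    common = sorted(shared(a, b))
    print(a, b, len(common), common)
```

On these lists, 1886 and 1896 share more words with each other than either shares with 1876, which matches what I noticed by eye. Running the tool with 5-10 topics would change the lists, so the counts could come out differently.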

I’ve also noticed that Michigan is a more common topic word in the later years (1886 and 1896) than in 1876. I wonder why that might be. Did they not see a point in using the word Michigan a lot in their early publications, seeing that it is a Michigan newspaper? They had to have talked about topics in Michigan considering it is a Michigan publication, so maybe they really didn’t see a need…I suppose I could look further into the articles and see when and where they used the word Michigan, if they did, and why. All 3 of these years do contain the word state, which is also interesting. I wonder what context they were using it in.

Through this assignment I was able to see how the further you get in your research and analysis the more questions you want answered. I enjoyed analyzing this publication through the topic modeling tool. I love how it allows for an open interpretation by just simply giving the main “topic” words in a cluster. I can’t wait to start using other DH tools for our own project.

Research and Collaboration

Today opened another side of DH I knew about, but never put much thought into. Previously, I thought that in order to do a DH project you need to do your research, complete your tasks, and collaborate with others. Now, I realize how completely chaotic and stressful that could be. Given that most of the debate was on our project topic today, I feel like the rest was just a bunch of dead ends. Of course it would’ve been ideal to have all the data we need just fall into our laps but that was not the case. After surfing the web for different databases and archives we didn’t exactly find what we were looking for, which then turned into a new topic angle with new research questions. In some cases it could even turn into a new topic in general.

I’d imagine this happens in a lot of DH research, and I applaud DHers that spend a lot of time and effort on their research (as they should). Through our readings I knew that research was an important aspect to DH, but I never knew how much time and effort could and should be spent on it. With just 30 minutes of research with about 10 people we thought we’d at least be able to locate an archive with MSU’s Mission Statements but we were only successful in finding a few.

Collaboration is another aspect of DH I thought would be a little easier, and I assume it will get easier as we are each assigned tasks (and hopefully complete them). As for right now, it seems a little rocky. Group projects are one thing, and I feel like a class project is another. With so many (but so few) of us in different areas of study I realized our possibilities with this project could be endless. It seemed like every time we chose a topic we got deeper and deeper into it and started adding new ideas with different angles based on our assumptions on data collection etc. Thinking critically about research and topic ideas is a very important skill to have when working on a DH project and we did that pretty well while collaborating in class so I feel like we’re getting on the right track.

DH is a reality to us now. We’re actually doing the work that Kirschenbaum talked about in his article. DH is a methodological interdisciplinary field in academia that focuses on the dissemination of knowledge through technology, and today we found that the internet really doesn’t have all of our answers (surprising right?), so we have to go deeper. Research is work, DH is work, and we’re finally getting ourselves into that work.

Finally, after exploring our options data-wise I can see how different questions and project angles could come up. However, I feel like it’d be easy to get off track if you want all of these interesting questions answered and your DH project just ends up being this huge mess without giving an actual message or getting a point across. You wouldn’t want your viewers saying “cool, now what?” I feel like it needs to have a point. As we continue to do our research it’s important to weigh our options and consider the data that is available to us, and right now it seems to be limited (within MSU’s history). So, looking at different colleges within MSU might be the way to go.

I’m definitely looking forward to seeing where this project will take us and to learning more about these aspects of DH.