![https://media3.giphy.com/media/L0qTl8hl84EDly62J1/giphy.gif?cid=ecf05e47wl2uvkvz3dxsp1axa4gf5tsk7s7nqytg7vwadj38&rid=giphy.gif&ct=g](https://media3.giphy.com/media/L0qTl8hl84EDly62J1/giphy.gif?cid=ecf05e47wl2uvkvz3dxsp1axa4gf5tsk7s7nqytg7vwadj38&rid=giphy.gif&ct=g)

I really love documentaries! Last weekend I was watching a Netflix documentary called [This is Pop](https://www.netflix.com/br/title/81050786), and since it was Analytics Contest time, I thought: why not build a pop song analytics project with InterSystems IRIS?

The first challenge was the database. On [data.world](https://data.world/typhon/billboard-hot-100-songs-2000-2018-w-spotify-data-lyrics) I found a CSV file with the Billboard Hot 100 songs from 2000 to 2018, created by "Michael Tauberg" @typhon, that fits perfectly.

I was talking to @Henrique.GonçalvesDias and he gave me the idea of using [Microsoft Power BI](https://powerbi.microsoft.com/) for a good-looking report with charts.

### Which genres were most popular between 2000 and 2018?

### Which artists had more songs on Billboard?

### Which year had more dance songs?

So let's analyze the data set. I imported the CSV file with the help of [csvgen](https://openexchange.intersystems.com/package/csvgen).

The data set contains:

- **Title**: title of the song.
- **Artist**: name of the artist.
- **Energy**: the energy of the song. The higher the value, the more energetic the song.
- **Danceability**: the higher the value, the easier it is to dance to the song.
- **Loudness..dB.**: the higher the value, the louder the song.
- **Liveness**: the higher the value, the more likely the song is a live recording.
- **Valence**: the higher the value, the more positive the mood of the song.
- **Duration_ms**: the duration of the song in milliseconds.
- **Acousticness**: the higher the value, the more acoustic the song.
- **Speechiness**: the higher the value, the more spoken word the song contains.
- **Lyrics**: the song lyrics.
- **Genre**: the musical genre of the song.
In the CSV file, Genre is an array like this: **[u'dance pop', u'hip pop', u'pop', u'pop rap', u'rap']**

My idea was to create a table for Genre and another table to resolve the N:N relationship between songs and genres. A simple script populates these tables from the data.

After that, just connect Power BI to InterSystems IRIS ([here is a step-by-step guide on how to do that](https://community.intersystems.com/post/power-bi-connector-intersystems-iris-part-i)).

Next step: cool infographics.

![https://github.com/henryhamon/pop-song-analytics/blob/master/assets/pop_songs_analytics_1.png?raw=true](https://github.com/henryhamon/pop-song-analytics/blob/master/assets/pop_songs_analytics_1.png?raw=true)

A bar chart shows the count of artists, and a line chart shows the average duration by year. A pie chart shows the most common genres. To my surprise, contemporary country was the most popular genre.

Has pop music gotten louder over the years? To answer that, I used a scatter plot of the average loudness of the songs.

### Have pop songs become more or less danceable?

On the second page, a bar chart shows how danceability changed over the years, alongside the relationship between energy and acousticness.

![https://github.com/henryhamon/pop-song-analytics/blob/master/assets/pop_songs_analytics_2.png?raw=true](https://github.com/henryhamon/pop-song-analytics/blob/master/assets/pop_songs_analytics_2.png?raw=true)

If you liked the idea, please consider voting for [pop-songs-analytics](https://openexchange.intersystems.com/package/pop-song-analytics): [https://openexchange.intersystems.com/contest/current](https://openexchange.intersystems.com/contest/current)

Special thanks to @Henrique.GonçalvesDias for the nice chat and support.
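For anyone curious about the genre split mentioned earlier, here is a rough Python sketch of the idea. It is only an illustration under assumptions, not the script used in the project: `parse_genres` and `build_genre_tables` are hypothetical names, the song ids are assumed to come from the imported table, and the Genre value is assumed to be a Python-style list literal like the example above.

```python
import ast


def parse_genres(raw):
    """Turn a string like "[u'dance pop', u'pop']" into ['dance pop', 'pop'].

    Assumes the Genre column holds a Python-style list literal;
    an empty or blank value is treated as "no genres".
    """
    if raw is None or not raw.strip():
        return []
    # literal_eval only evaluates literals, so it is safe on untrusted text.
    return [name.strip() for name in ast.literal_eval(raw.strip())]


def build_genre_tables(songs):
    """Build the Genre table and the song/genre link table.

    songs: iterable of (song_id, raw_genre_string) pairs (hypothetical input).
    Returns (genres, song_genre) where genres maps genre name -> genre id
    and song_genre is a list of (song_id, genre_id) rows.
    """
    genres = {}
    song_genre = []
    for song_id, raw in songs:
        # dict.fromkeys drops repeated genres within one song, keeping order.
        for name in dict.fromkeys(parse_genres(raw)):
            genre_id = genres.setdefault(name, len(genres) + 1)
            song_genre.append((song_id, genre_id))
    return genres, song_genre


if __name__ == "__main__":
    sample = [
        (1, "[u'dance pop', u'hip pop', u'pop', u'pop rap', u'rap']"),
        (2, "[u'pop', u'rap']"),
    ]
    genres, song_genre = build_genre_tables(sample)
    print(genres)
    print(song_genre)
```

The two results map directly onto the two tables: one row per distinct genre, and one row per song/genre pair, which is what lets Power BI count songs per genre without parsing the array.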