music video

Rails app that imports Sofar Sounds music video metadata from an external source and exposes it via a search API

Sofar Sounds Music Video Search API

This project satisfies the requirements outlined in the Sofar Sounds Backend Test
(https://github.com/sofarsounds/backend-tech-test): import data from an
external file hosted on Amazon and allow it to be searched via an API.

To satisfy these requirements, I have built a Rails app (Rails version 5.2.3,
Ruby version 2.6.1) that exposes a simple index action for searching music
videos by song name, artist name, and/or city. To add some basic text searching support,
I used the FULLTEXT indexing functionality available within MySQL
(https://dev.mysql.com/doc/refman/8.0/en/fulltext-search.html). This indexing strategy
would need to be reviewed against requirements such as the expected growth of the dataset
and the uniqueness/distribution of words across the dataset.
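
As a rough illustration of how such a search might be expressed, the sketch below builds the WHERE clause for a MySQL FULLTEXT query from the request parameters. The column names (song_name, artist_name, city), the SEARCHABLE constant, the fulltext_conditions helper, and the MusicVideo model name are assumptions made for illustration and are not taken from the app's code. It also assumes a separate FULLTEXT index exists on each column.

```ruby
# Hypothetical sketch: the column names and this helper are assumptions,
# not the app's actual implementation.
SEARCHABLE = %w[song_name artist_name city].freeze

# Builds a [sql_fragment, *bind_values] array in the form ActiveRecord's
# `where` accepts. Blank or unknown parameters are ignored, so any
# combination of the three filters can be supplied.
def fulltext_conditions(params)
  terms = SEARCHABLE.each_with_object({}) do |column, found|
    value = params[column].to_s.strip
    found[column] = value unless value.empty?
  end

  sql = terms.keys.map { |column| "MATCH(#{column}) AGAINST(? IN NATURAL LANGUAGE MODE)" }
  [sql.join(" AND "), *terms.values]
end

conditions = fulltext_conditions("song_name" => "reading", "city" => "LoNdon", "page" => "2")
# With at least one filter present, a (hypothetical) model could use it as:
#   MusicVideo.where(conditions)
```

Passing the search terms as bind values rather than interpolating them keeps user input out of the SQL string; only the whitelisted column names are interpolated.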

Dependencies

(On Mac only) Install Xcode tools: xcode-select --install

GoRails has a great, up-to-date setup guide, which covers a number of the remaining dependencies listed below:

  • Homebrew
  • RVM, my preferred tool for Ruby version management, though you can also use rbenv.
  • Ruby. This project was developed using one of the more recent stable versions of Ruby: rvm install 2.6.1
  • Bundler, which is used to manage gem dependencies: gem install bundler
  • Ruby on Rails.
  • MySQL

Build and Run

  • Clone the repository locally into your preferred directory: git clone https://github.com/prangarang/music-video.git
  • Change into the project directory: cd music-video
  • Install gem dependencies: bundle install
  • Create DB/DB Schema and seed the database with data from S3: bundle exec rake db:create db:migrate db:seed
  • Start the rails server: bin/rails server
  • Test API: curl -v -H "Accept: application/json" "http://localhost:3000/music_videos?song_name=reading&city=LoNdon"
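
The same request can also be made from Ruby. The sketch below is a minimal client using only the standard library; the BASE_URL constant and the music_videos_uri and search_music_videos helper names are hypothetical, and the shape of the JSON response is not assumed beyond it being valid JSON.

```ruby
require "net/http"
require "uri"
require "json"

# Assumes the Rails server from the previous step is listening locally.
BASE_URL = "http://localhost:3000/music_videos"

# Builds the search URL; filters with empty values are left out of the query string.
def music_videos_uri(filters)
  uri = URI(BASE_URL)
  query = filters.reject { |_, value| value.to_s.empty? }
  uri.query = URI.encode_www_form(query) unless query.empty?
  uri
end

# Performs the GET request and parses the JSON body (requires a running server).
def search_music_videos(filters)
  uri = music_videos_uri(filters)
  request = Net::HTTP::Get.new(uri)
  request["Accept"] = "application/json"
  response = Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
  JSON.parse(response.body)
end

# Only builds and prints the URL, so this line runs without a server.
puts music_videos_uri(song_name: "reading", city: "LoNdon")
```

Calling search_music_videos(song_name: "reading", city: "LoNdon") while the server is running should return the parsed equivalent of the curl response above.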

Tests

To run the RSpec test suite: bundle exec rspec spec/

Project Next Steps

Below are some additional things to focus on in the future.

  • Importing
    • After understanding more about the data quality requirements, decide how to handle
      records missing specific metadata. Should they be skipped? Should they be flagged for
      review with the external source provider?
    • If the number of records increases in future, look into possible improvements such as
      processing/inserting in batches, and how to work around the potential issue that inserting
      into a table with an index is slower than adding the index afterwards.
    • If continual updates are required, update the importer based on
      how updates are reflected in the external data source (update in place or new row for each
      change).
  • Model
    • Based on knowledge of external dataset, consider making video_uid a unique key to
      support importing.
  • API
    • As usage of the API is fleshed out, consider releasing a new version of the API with a different
      JSON response format.
    • Find out requirements for exposing records with missing metadata.
    • Expose a show action to support re-fetching of specific records by video_uid.
  • Searching/Indexing
    • Figure out requirements for commonly used words such as ‘the’, as this will likely
      help inform the future indexing strategy/configuration.
  • Tests
    • For tests requiring the search functionality, figure out a more elegant way to
      handle the fact that RSpec tests run within a transaction, whereas testing searching
      requires transactions to be committed. This appears to be the case for other searching
      solutions such as Elasticsearch. See spec/support/shared_examples.rb for further
      explanation.
    • Agree on the preferred directory organization of tests with the team.
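
To make the importing ideas above more concrete, here is one possible sketch of flagging records with missing metadata and inserting the rest in batches. Apart from video_uid, the field names are assumptions, as are the REQUIRED_FIELDS constant, the helper names, and the batch size; the block passed to import_in_batches stands in for whatever bulk insert the app would use.

```ruby
# Hypothetical sketch: field names (other than video_uid), helpers, and
# batch size are assumptions, not the app's actual importer.
REQUIRED_FIELDS = %w[video_uid song_name artist_name city].freeze

# Splits rows into [importable, flagged]; a row is flagged when any
# required field is missing or blank.
def partition_rows(rows)
  rows.partition do |row|
    REQUIRED_FIELDS.all? { |field| !row[field].to_s.strip.empty? }
  end
end

# Yields importable rows in fixed-size batches and returns the flagged
# rows so they can be reviewed with the external source provider.
def import_in_batches(rows, batch_size: 500)
  importable, flagged = partition_rows(rows)
  importable.each_slice(batch_size) { |batch| yield batch }
  flagged
end

rows = [
  { "video_uid" => "a1", "song_name" => "Song A", "artist_name" => "Artist A", "city" => "London" },
  { "video_uid" => "b2", "song_name" => "", "artist_name" => "Artist B", "city" => "Paris" },
  { "video_uid" => "c3", "song_name" => "Song C", "artist_name" => "Artist C", "city" => "Berlin" }
]

batches = []
flagged = import_in_batches(rows, batch_size: 2) { |batch| batches << batch }
# In the app, the block could perform a bulk insert instead of collecting batches,
# for example with the activerecord-import gem, or insert_all on Rails 6+.
```

Returning the flagged rows rather than silently dropping them leaves the skip-or-review decision open until the data quality requirements are known.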