The video game Grand Theft Auto V, with the help of a dedicated modding community, has evolved far past its originally intended experience. Through a multiplayer modding framework called FiveM, users have built large multiplayer roleplay servers centred around the GTA5 world. One of these servers, known as NoPixel, has attracted many roleplayers and streamers and built a massive community following.
The MDW exists as a tool on this server for law enforcement roleplayers to perform their duties. It acts as a mobile terminal containing a database of all civilian profiles, along with any incidents, reports, and evidence that would be useful to government or police officials. This gives law enforcement officers a centralised database of information that can assist in investigating, making, and reporting an arrest.
The MDW is only accessible to individuals who are actively on the server. Viewers of the roleplay are therefore restricted to following along with whatever a specific streamer happens to show at any given time. And with over 200 roleplayers actively playing at once and thousands of characters overall, it can be very difficult to keep up with the world's goings-on. As such, the goal of this project was to extract useful data from the MDW, compute some interesting metrics, and provide an external interface for other interested community members.
In order to build this database of MDW data, I first needed to collect the dataset. Since I had no direct access to the MDW database itself, I had to resort to alternative methods. Thankfully there are many VODs from roleplay streamers containing MDW perspectives. The only challenge, then, is how to efficiently detect and extract these frames.
First off, let's mention some of the constraints I imposed early on.
For the first problem of extraction, I needed a way to stream VODs frame by frame. Luckily there is a very straightforward Python library called VidGear that supports this exact use case.
I chose to take a sample every second, as that was a fair balance between sampling frequency and computation/storage cost.
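As a rough sketch, the sampling loop can look something like this using VidGear's CamGear in stream mode (the VOD URL and output path are just placeholders):

from vidgear.gears import CamGear
import cv2

# Open the VOD as a stream instead of downloading the whole file first
stream = CamGear(source="https://www.twitch.tv/videos/SOME_VOD_ID", stream_mode=True).start()
fps = int(round(stream.framerate))  # source frame rate reported by VidGear

frame_index = 0
while True:
    frame = stream.read()
    if frame is None:  # end of the VOD
        break
    if frame_index % fps == 0:  # keep roughly one frame per second of video
        cv2.imwrite(f"samples/frame{frame_index}.png", frame)
    frame_index += 1

stream.stop()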
Since we can assume that the MDW layout is proportionally the same no matter the content, we simply need to look at the position where the left-most section header would be and perform an OCR detection.
This header reads 'Profiles' on the profiles page and 'Incidents' on the incidents page, so it can not only tell us whether the MDW is on screen but also which page is currently being shown.
For OCR detections, I made use of the Tesseract OCR engine via the tesserocr Python library.
To get the most accurate output, some image pre-processing needs to be done.
Tesseract works best with black text on a clean white background.
In this case, simply using a threshold to set all colors below a value to black and then inverting the image gives an ideal result.
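Putting those steps together, a minimal sketch of the page check (the crop rectangle and threshold value here are illustrative placeholders, not the real tuned values):

import cv2
import tesserocr
from PIL import Image

def classify_mdw_page(frame):
    """Return 'Profiles', 'Incidents', or None based on the left-most header."""
    x0, y0, x1, y1 = 210, 150, 420, 190  # placeholder crop of the header area
    header = frame[y0:y1, x0:x1]

    # Tesseract prefers black text on a clean white background:
    # threshold away the dark UI background, then invert the light text
    gray = cv2.cvtColor(header, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)
    mask = cv2.bitwise_not(mask)

    text = tesserocr.image_to_text(Image.fromarray(mask)).strip()
    if text.startswith("Profiles"):
        return "Profiles"
    if text.startswith("Incidents"):
        return "Incidents"
    return None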
Assuming a frame is validated as an MDW frame, we save that image and categorise it by the source VOD identifier.
Having them sorted this way makes it easier to compare and rule out outdated information later on (since VOD IDs are sequential).
From the extraction phase we have a large collection of MDW frames, each labelled with the type of page it is showing, whether that be a profile or an incident.
The program will now work through each frame and perform the actual extraction.
For now, we are only going to explore how the Profile extraction works but many of the principles I'll explain apply to Incident processing as well.
For reference, I will refer to the different major sections as outlined below.
Each profile that is processed tracks an associated sources file.
This associates the specific data entries that can be updated with a source frame and whether that entry is considered complete.
An entry is complete when the program can confirm it has read the entirety of said entry.
Having entries marked as complete allows these sections to be skipped over on repeated frames, which helps to reduce the overall computation time.
I broke sections down into the following associated source entries:
With the frame verified and a valid state ID associated with it, we can now perform detections on the profile content.
Starting with the center section, I needed a way to determine where the description box starts and ends as it can appear differently depending on its content and scroll position.
Since the description box has a darker colored background, it is possible to isolate it and perform a box detection algorithm to look for its extents.
For box detection I utilised an algorithm described here and modified it to best suit my use case.
https://medium.com/coinmonks/a-box-detection-algorithm-for-any-image-containing-boxes-756c15d7ed26
This algorithm works best on high-contrast masks with as little noise as possible.
By adjusting both the brightness and contrast, and with some thresholding, you get a box mask that looks something like this.
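The preparation itself is only a few lines; a sketch, where the alpha/beta and threshold values are illustrative rather than the real tuned numbers:

import cv2

def description_mask(frame):
    # Boost contrast (alpha) and drop brightness (beta) so the darker
    # description background separates cleanly from the rest of the page
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    adjusted = cv2.convertScaleAbs(gray, alpha=2.0, beta=-60)
    # Inverse threshold: the dark box area ends up white in the mask
    _, mask = cv2.threshold(adjusted, 40, 255, cv2.THRESH_BINARY_INV)
    return mask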
Running it through the algorithm returns a list of all box-like shapes detected in the mask.
With some checks for minimum and expected widths, we can filter the list down to one image containing only the description area.
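A condensed sketch of the box detection step, loosely following the linked article (the kernel sizes and width filter are illustrative):

import cv2

def detect_boxes(mask, min_width=300):
    """Return bounding rectangles of box-like shapes in a binary mask (box regions white)."""
    h, w = mask.shape[:2]

    # Pull out long vertical and horizontal structures with directional kernels
    v_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, h // 40))
    h_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (w // 40, 1))
    vertical = cv2.dilate(cv2.erode(mask, v_kernel, iterations=3), v_kernel, iterations=3)
    horizontal = cv2.dilate(cv2.erode(mask, h_kernel, iterations=3), h_kernel, iterations=3)

    # Recombine the two results into a clean image of box shapes
    grid = cv2.addWeighted(vertical, 0.5, horizontal, 0.5, 0.0)
    _, grid = cv2.threshold(grid, 128, 255, cv2.THRESH_BINARY)

    # Every closed outline becomes a candidate box
    contours, _ = cv2.findContours(grid, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours]

    # Drop anything narrower than the section we expect to find
    return [(x, y, bw, bh) for (x, y, bw, bh) in boxes if bw >= min_width]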
With the extents of the description box known, the Identity section can be assumed to be the remaining space above it, as long as that space is greater than a minimum size.
In cases where the section is scrolled so that only the description box is visible, the identity detection step will be skipped.
The identity section itself contains the profile's State ID, the person's name, and a link to a profile image.
To extract this data I simply performed an OCR detection over the entire content and matched the headers with the content contained on the next line.
Since we already collected the State ID in the verification step, that header can be ignored here.
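A sketch of that header-to-next-line matching; the header wording here is an assumption, not the exact on-screen labels:

def parse_identity(ocr_text):
    """Pair each recognised header with the line that immediately follows it."""
    headers = {"Name": "Name", "ImageURL": "Image"}  # assumed header wording
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]

    result = {}
    for i, line in enumerate(lines[:-1]):
        for field, header in headers.items():
            if line.lower().startswith(header.lower()):
                result[field] = lines[i + 1]  # the value sits on the next line
    return result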
Compiling the different description detections into a single cohesive text is a significant challenge, so I opted to sacrifice some accuracy here for a solution that worked to an at least satisfactory level.
The solution I settled on was to look for matching lines in the source and new descriptions and define these as regions.
Regions that were found in the new detection then overwrite those that were matched in the source text with any new additions that they might include. Lines outside any regions are preserved as is.
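A deliberately simplified sketch of the merge, treating everything between the first and last matching line as a single region (the real implementation tracks multiple regions):

def merge_description(existing: str, detected: str) -> str:
    """Overwrite the matched region of the stored description with the new detection."""
    old = existing.splitlines()
    new = detected.splitlines()

    # Lines that appear verbatim in both texts anchor the region
    anchors = [i for i, line in enumerate(old) if line and line in new]
    if not anchors:
        return existing  # no overlap, nothing we can safely merge

    start, end = anchors[0], anchors[-1]
    new_start = new.index(old[start])
    new_end = new.index(old[end])

    # The new detection replaces the matched span, including any lines it adds
    # within it; lines outside the region are preserved as-is
    return "\n".join(old[:start] + new[new_start:new_end + 1] + old[end + 1:])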
In the end this method ended up being good enough. It suffered heavily from OCR detection inaccuracies, since it expects exact line matches when creating the regions. And for any legitimate change made to an existing line, the algorithm would either duplicate it if it was seen in a known region or ignore it completely otherwise.
To extract the profile details, I needed to separate the sections into individual images before I could do any extraction.
The box extraction algorithm is again very useful here.
With some processing to get a suitable mask image, running it through the algorithm returns each details section as a separate image.
From here I performed an OCR detection over the entire content to look specifically for any section headers.
This allowed me to not only determine what each section contained but also to catch any box detection errors where multiple sections were incorrectly grouped into one image.
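In code this amounts to scanning each candidate image's OCR output for known header labels, something like the sketch below (the label list is an assumption based on the fields in the final JSON):

SECTION_HEADERS = ("Vehicles", "Housing", "Hotels", "Employment", "Licenses", "Tags")  # assumed labels

def identify_sections(ocr_text):
    """Return every known header found in one candidate section image.

    One hit tells us what the box contains; several hits mean the box
    detection merged neighbouring sections and the image needs re-splitting.
    """
    lowered = ocr_text.lower()
    return [h for h in SECTION_HEADERS if h.lower() in lowered]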
In some cases the Priors section content can extend to cover the full page, obscuring the header from view. Here I added a specific test to detect whether a subset of the visible tags are valid prior tags.
If the test passes then the full section is treated as a priors section and data is extracted as normal.
Otherwise, when no valid header is detected, the image is ignored.
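The test itself can be as simple as checking the OCR'd tag texts against a list of known charges; a sketch, where both the example charges and the ratio are purely illustrative:

KNOWN_CHARGES = {"Petty Theft", "Evading", "Criminal Possession of a Weapon"}  # illustrative subset

def looks_like_priors(tag_texts, min_ratio=0.75):
    """Treat a header-less section as Priors if most visible tags read as charges."""
    if not tag_texts:
        return False
    hits = sum(1 for tag in tag_texts if any(charge in tag for charge in KNOWN_CHARGES))
    return hits / len(tag_texts) >= min_ratio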
Now with each section separated and its type known, another box detection is done to specifically extract the section's tags.
Each of these individual tag images then has its text read and interpreted into its valid format.
Upon the first detection, details sections are read and saved directly. On subsequent detections, a section is only updated if it is not considered complete or if the detection comes from a new source. A section is assumed complete when the next section is also visible in the same frame. In cases where the Employment or Priors section is not complete and only a partial detection is available, entries can still be updated through the 'complex pathway'. These two sections were chosen explicitly because they are the most frequently occluded and their contained tags are sorted.
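Ignoring the complex pathway, the update decision reduces to something like this sketch, assuming a sources dict shaped like the Sources.json example further below:

def should_update(section, sources, frame_source):
    """Apply a freshly read section unless it is already complete from this same source."""
    if not sources.get(f"{section}_Complete", False):
        return True                              # not yet complete: always take new data
    return sources.get(section) != frame_source  # complete, but a new source can still supersede it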
With all usable data extracted from the profile, it is compared with any existing data and compiled into an individual JSON file. Storing each profile individually in an intermediate stage makes it significantly easier to perform corrections and allows me to easily track a ground truth profile source separate from any that might appear in the final database.
{ "StateID": 2151, "Name": "Emma Gaine", "ImageURL": "https://i.imgur.com/dNN4kwo.jpeg", "Description": "P#:4155122618\n\nMayor\n\nClass 2 License Issued 4/28/22 - Judge Cross\nSerial# 2151-iON-833 | serial: 2151", "Vehicles": [ "Camaro ZL1", "Injection", "Seashark", "Kalahari", "Blazer" ], "Housing": [ { "Name": "No3", "Type": "Apartment" }, { "Name": "Kimble Hill Drive 2", "Type": "Keyholder" }, { "Name": "Mirror Park Blvd 7", "Type": "Owner" } ], "Employment": [ { "Name": "Happy Yoga", "Role": "Owner" }, { "Name": "Homestead Bakery", "Role": "Owner" }, { "Name": "Little Paws Animal Hospital", "Role": "Veterinary Surgeon" } ], "Licenses": [ "Drivers License", "Hunting License", "Fishing License", "Weapons License", "Medical License", "Business License", "Oil Pump License", "Class 2 Weapons License" ], "Tags": [] }
-Sources.json-
{
  "Identity": "0",
  "Identity_Complete": true,
  "Vehicles": "1488461857",
  "Vehicles_Complete": true,
  "Housing": "1488461857",
  "Housing_Complete": true,
  "Hotels": "1488461857",
  "Hotels_Complete": true,
  "Employment": "1488461857",
  "Employment_Complete": false,
  "Licenses": "1488461857",
  "Licenses_Complete": true,
  "Tags": "1488461857",
  "Tags_Complete": false
}
-Log.txt-
-----2022-06-27_19:00:23-----
> Operating on '0/frame349320.png' <
State Id: 2151
Updated entries: 'Identity' 'Vehicles' 'Housing' 'Hotels' 'Employment'
-----2022-06-27_20:52:49-----
> Operating on '1488461857/frame349260.png' <
State Id: 2151
Updated entries: 'Licenses' 'Tags' 'Vehicles' 'Housing' 'Hotels' 'Employment'
-----2022-06-27_20:52:51-----
> Operating on '1488461857/frame349320.png' <
State Id: 2151
Updated entries: 'Employment'
-----2022-06-27_20:52:52-----
> Operating on '1488461857/frame349380.png' <
State Id: 2151
Updated entries: 'Licenses' 'Employment'
Following this stage, a final script is used to compile all the profiles into a single dataset.
This script also does some additional processing, like separating properties and businesses out into their own databases and merging in any priors found in incident reports that are missing from a profile because they were never detected there.
And with that, the entire database is complete and ready for use.
Over a 2 month period of sporadic data collection, I accumulated just over 125,000 MDW-specific frames from around 100 VODs.
After processing, 579 unique MDW profiles were extracted with over 2000 additional updates computed.
Additionally, around 850 complete incidents were extracted with a further 3000 incomplete incidents collected.
I built a website inspired by the MDW's design to present the data and allow the wider community to access it.
Check it out here:
https://h2n9.github.io/OMDW/