From Pixels to Tactics: Using Computer Vision to Data Mine for a Game AI
Description
Welcome to "The Art of Data Mining: Scraping Image Data for a Game AI using Computer Vision Techniques with Teamfight Tactics." In this exciting video, we delve into the world of advanced data mining and explore how computer vision techniques can be employed to scrape image data from the popular game Teamfight Tactics.
Join us as we unlock the secrets of extracting visual information from Teamfight Tactics using cutting-edge computer vision algorithms. Discover how this process can be a game-changer for developing powerful AI models and enhancing your gameplay experience.
Through step-by-step demonstrations, we'll guide you through the intricacies of image scraping, showcasing practical techniques to extract valuable data that can be utilized for various purposes, including AI training, statistical analysis, and visualizations.
Learn about the underlying principles of computer vision, explore the tools and libraries used for image scraping, and witness firsthand how the art of data mining can transform your understanding of Teamfight Tactics.
Whether you're an aspiring game AI developer, a data enthusiast, or simply curious about the fascinating world of data mining, this video is a must-watch. Expand your knowledge, sharpen your skills, and embark on an exciting journey into the realm of scraping image data with Teamfight Tactics.
Don't miss out on this opportunity to uncover hidden insights and take your gaming experience to new heights. Join us now and discover the limitless possibilities of data mining with Teamfight Tactics and computer vision techniques.
This is part 5 of my series on coding an AI to play TFT. It teaches computer vision techniques in JavaScript and TypeScript. Topics covered: OpenCV, TensorFlow, Linear Algebra, Screen Scraping, Screen Capture, Image Cropping/Scaling, Image Similarity Comparison, Image Search, Debugging
Transcript
Hey, I'm coding a bot to play a strategy game, and I'm documenting it so we can learn together along the way. This part of the series is on computer vision. By the end of this video you'll be able to scrape the important elements out of an image like this into simple variables for the rest of your app's code to use. In a future video we'll send those variables to an AI model so it can make the best in-game decisions possible to win the game. If you're new here, my name is Devin, and this channel is for coders interested in gaming, hacking, and AI. Let's get into it.

There are several approaches to extracting information out of a game. Computers don't have eyes like we do, so they can't just visually see what's happening in the game, but without this information the AI can't make decisions. In a previous video I showed how to extract text out of an image. On this screen, the important text elements could be the round number, how much time I have left, and how much health I have, and on this other screen there are many more text elements, such as the characters available to purchase from the shop, how much gold I have, what level I am, how much experience I have, etc. Unfortunately, there are some data elements the AI needs that are not available as text. For example, this little fire icon tells me if I'm on a win or loss streak, and that affects how much bonus gold I get after each round. So I may want to intentionally keep losing if I'm already on a loss streak, or I may want to spend all my gold to keep winning if I'm on a win streak. But because this is an image with no text, I can't use the techniques from the previous video. In addition, on this screen there are icons which represent the item I will receive if I pick that unit, but this screen is more complicated because, as you saw previously, the units with their items are rotating around in a circle. So we'll use two different techniques for scraping these images: one for this fire icon, because it's in a static position (we always know the X and Y coordinates, and we just need to determine if it's blue, red, or gray), and a separate technique for these items, because they're moving around the screen and we don't know their location.

Now, the actual 3D model of the character would be useful to scrape so that we know which character we're selecting, but that's much more difficult, because the 3D model is animated: as it walks around in a circle it's turning and also animating footsteps, whereas these items are just a static image. Although we could try to analyze the 3D model, we're going to use the easiest technique for each problem, and in this case I can instead click on this eyeball, which pops up an overlay like this on the side of the screen, and from that overlay I can just read the character's name using the text recognition algorithm I showed in the previous video. So there's no reason to code a complicated algorithm to analyze the 3D model if we don't need to.

One other common approach is to read the memory of the game while it's executing, because the game's RAM knows what characters are being rendered on the screen and where, and if we could find those variables we could get the data that way. I've chosen not to use this approach simply because I know Riot has anti-cheat mechanisms in their code, so each time they patch the game the memory locations are going to change, and I don't want to have to recode my AI constantly. But I know the visuals are unlikely to change, because those are the most important part of the player experience, and scraping the image is not incredibly difficult as long as we're not dealing with 3D models.
So let's start the code that can scrape this fire icon out of the game. Here I've opened Visual Studio Code, and I have some boilerplate code pasted in; this is actually the project I started in the previous video. As a reminder: Lodash has some useful utility functions that are common in any JavaScript program; OpenCV is for computer vision, so it will help us analyze images; and TensorFlow is for neural networks, but it also has a lot of matrix math functions, and since images are essentially matrices of colors, some of those math functions will be useful. fs, path, and os are utility libraries built into the Node.js platform that will help me access files off my hard drive. In this case I've extracted screenshots out of my gameplay so that I can quickly make changes to my code and retest without launching a brand new game and playing to that specific screen.

For these imports to work (Lodash, OpenCV, and TensorFlow), I had to open the terminal and type npm install plus the package name. Those packages are hosted on the npmjs website, and the npm install command modifies my package.json to add each package as a dependency along with a version number. A package is essentially a library of code that I can use. Those packages are downloaded into the node_modules folder inside my project, and you'll notice hundreds of packages even though I only picked a few; that's because the packages I installed may depend on other packages. For example, under @u4 I can see the opencv4nodejs package that I imported, and here's its code. I don't need to read any of that code: when I type the import command, JavaScript knows how to find the code and let me reference it. This way I don't have to reinvent the wheel by coding my own computer vision algorithms; I can use the ones from this open source library that someone else has already coded for me.
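As a concrete sketch, the top of the file probably looks something like this. The exact package names are my reading of what's shown on screen; the OpenCV binding appears to be the @u4/opencv4nodejs fork:

```ts
// npm install lodash @u4/opencv4nodejs @tensorflow/tfjs-node
import _ from 'lodash';                      // general-purpose JavaScript utilities
import cv from '@u4/opencv4nodejs';          // OpenCV bindings for computer vision
import * as tf from '@tensorflow/tfjs-node'; // matrix math (and neural networks later)
import fs from 'fs';                         // built-in: read screenshot files from disk
import path from 'path';                     // built-in: build file paths
import os from 'os';                         // built-in: OS-level helpers
```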
Okay, let's get started coding. I'm going to make a function called areImagesSimilar (you could call this whatever you want), and it's going to return a boolean, meaning true or false. The purpose of this function will be to take one of these two icons, the blue or red streak, and compare it against this image to see if it's a match. However, I'm not going to compare the entire image; instead I will crop just this portion of the image and compare that. So I need to pass the images into this function. I'll call the first one imageA, and it will be of type computer vision matrix, because again, an image can be represented as a matrix. I'll have a second image, imageB, and I'll also pass in a parameter called threshold, which is essentially the percentage match I require to consider the images similar. This will let me fine-tune my results, so that if the algorithm is not working I can increase or decrease the percentage of what counts as a match.

The algorithm I'm going to use for areImagesSimilar is called mean squared error, and let me explain the math behind how it works before I code it. Let's pretend I have three images. Each image can be represented as a matrix of colors, and these images are two pixels wide and three pixels tall. I'm going to open up a program I like called Color Cop to explain how pixels are represented in the computer. This first pixel, in the top left corner of the first image, is 255, 0, 0. The first number is the red, the second is the green, and the third is the blue, so I have full intensity on the red channel and no intensity at all on the green or the blue. 255 is the maximum number, and that's because colors are typically stored as 32-bit numbers where each channel uses eight bits. If you look at binary in a calculator, 2 to the eighth power (meaning eight bits) is 256, and that gives us a maximum of 255 because computers count from zero instead of one; the range from 0 up to 255 has 256 different combinations. So the number 255, represented in binary, would be eight ones. That gets cumbersome, so the alternative is hexadecimal notation: the hashtag just means it's a color, the FF is the red channel (equivalent to the eight ones), the next 00 is the green channel, and the last 00 is the blue channel.

I mentioned that colors are 32 bits. Since red, green, and blue are each eight bits, eight times three is only 24, so if I add 8 more it becomes 32. The final eight bits are for the alpha channel, which is for transparency. If I look at a screen from a video game, you'll notice that the game is rendered in layers: you have the background underneath everything, which is this landscape the characters are walking on; on top of that is rendered the 3D model of the character; and on top of that are rendered any UI elements, like this damage number, this health number, this health bar, or the heads-up display that gives the player information about the game. You can see how they're all rendered in layers. The transparency is necessary so that the layers can blend together. For example, on this number 160 you can see that the zero is hollow: it's transparent in the center, so it blends with the background and you can still see the background bleeding through underneath. So transparency is very useful when you're rendering graphics, but since we're taking the final rendered image and analyzing it, we don't need transparency; we're just going to use 24-bit colors.

That's how an individual pixel is represented, and an image is just a grid of pixels, in this case two pixels wide and three pixels tall. If I opened up Paint and changed the canvas to two wide and three tall, I could draw these red pixels: one pixel in the top left corner, one in the top right, here's my second row, and here's my third row. That's exactly the little image being represented here. The second image is almost identical, except the intensity of the red is very slightly less, and the third image, instead of being red, is going to be green. If we open up Color Cop we can see how colors mix together. For example, if I take full-intensity red and mix it with full-intensity blue, I get kind of a purplish color; if I reduce the amount of blue, it's a bit more pink; but if instead I have full blue and reduce the red, it's a bit more blue.

So how am I going to determine if two images are similar? If I just wanted to know if they were identical, I could compare each pixel one at a time, and within the pixel I could compare the intensity of each channel: red, green, and blue. The problem is that it's going to be rare for images to be truly identical. For example, in a game you may have glowing effects, where an icon shimmers or shines for a second to catch the user's attention, so two screenshots taken a second apart may not be identical, but they should still be similar. Also, different hardware may render graphics slightly differently (really old hardware would not support 32-bit color, for example), and gamers play at different resolutions. That's why we have this threshold parameter: to define what percentage of similarity we consider close enough, and that's going to depend on our specific use case and what we're trying to accomplish.

The way the mean squared error algorithm works is that for images that are pretty similar, like one and two, it calculates a very small result, and for images that are very different, like image three, it calculates a very large result. That way we can easily distinguish between images that are almost the same and images that are totally different. First it calculates the error, which is just subtracting the numbers for each channel of each pixel to get the difference between them, the delta. So for comparing images one and two, the error would be a brand new image that has ones for the red channel and zeros for the green and blue, because 255 and 254 are one off from each other. Next it squares them, giving the squared error: 0 times 0 is still zero, and 1 times 1 is still one, so in this case the squared error is the same as the original error. But let's pretend for a minute that this number was 251 instead of 254. That would make our error 4, and our squared error 16, because 4 times 4 is 16. Notice that making the values only three further apart made the squared value go up a lot. That's the benefit of squaring: numbers that are similar stay low, but numbers that are far apart get really big. Finally, we get the average, or mean, by adding all of these numbers together and dividing by the count. I have six total pixels, and each pixel has three numbers, so that means I have a total of 18 numbers: 16 and 0 and 0, 16 and 0 and 0, and so on. If I add up all 18 numbers I get a total of 96, and dividing by 18 gives 5.3. That is the mean squared error.

Finally, I want to convert this to a percentage. The highest possible mean squared error, which you'd get comparing an image of all zeros (black) to an image of all 255s (white), would be 255 times 255. So I divide the 5.3 by that number to get a percentage of how different the images are, a number between zero and one hundred percent. In this case it is extremely small, so I could use a very small threshold and still consider those images similar. Now let's compare image 3 to image 1 to see how different the result is. In this case the error for the red channel is 255, the green is another 255, and the blue is 0. So for the squared error, the red would be 255 times 255, the green would also be 255 times 255, and the blue would be zero. You can see how that adds up to a very large number, and after it's averaged it becomes 43,350. Converting to a percentage by dividing by that same normalization factor of 255 times 255 gives about 0.67, meaning 67 percent. Compare that to our previous result with the very similar images: 5.3 divided by the normalization factor gives about 0.00008, which is not even a hundredth of a percent. So very similar images get a very small number and very different images get a very large number. Now, 0.67 might not sound like a large number, but the maximum is 1, meaning one hundred percent. Hopefully that made sense.
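Written out in one formula (my notation, matching the walkthrough above; a_i and b_i are corresponding channel values and N is the total count of channel values):

```latex
\mathrm{MSE}(A, B) = \frac{1}{N} \sum_{i=1}^{N} \left( a_i - b_i \right)^2
\qquad
\mathrm{score}(A, B) = \frac{\mathrm{MSE}(A, B)}{255^2}
```

With our 2-by-3 images, N = 2 x 3 x 3 = 18 channel values. Images 1 vs 2 (with the pretend 251) give MSE = 96/18 ≈ 5.3 and a score of 5.3/65025 ≈ 0.00008, while images 1 vs 3 give MSE = 780300/18 = 43,350 and a score of 43350/65025 ≈ 0.67.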
Let's code this up. I'm going to start by taking imageA and calling the absdiff function, which is part of the OpenCV library, and passing it imageB. That stands for absolute-value difference: the difference subtracts the channels from each other (red minus red for each pixel, green minus green for each pixel, etc.), and the absolute value converts all negative numbers to positive. The reason for the absolute value is so that the order the images are subtracted in doesn't matter: if the first image has five for its red and the second has four, then swapping the order, four minus five, gives you negative one instead of positive one. The absolute value makes sure the result is the same regardless of which order you subtract.

Next I need to square the difference. Unfortunately OpenCV does not have a method for that, but TensorFlow does, so I'm going to call cvToTensor on this diff image. I coded that up in the last video, where we did OCR (text recognition), so you can go back to that video if you need the code for this method; just know it converts from the OpenCV matrix format to the TensorFlow matrix format so that we have access to the TensorFlow APIs. Now that it's a tensor, I have access to the square method, and after I square it I can take the average. So now we have our mean squared error in our result variable. However, it's not normalized to a percentage yet. To do that, I take my result variable, which is still inside a tensor (a matrix, even though it only holds one value), and call the data function to convert it from a tensor to a normal JavaScript number. The data method has an asynchronous version and a synchronous version. This is because if you're dealing with a really large matrix, the math we queued up earlier (the square and the mean) could take a long time, and the way TensorFlow works, the commands you send it are queued up in a chain and don't actually execute until you extract the final value. I'm going to call the synchronous version just to keep our code simple; I explained asynchronous code with promises and the await command in the last video, and you could use that approach as well. The dataSync method gives me the matrix as a JavaScript variable, and it's typed as an array, because that's what matrices are, but because I took the average, I know there's only one single value inside. This is one example where TypeScript gets confused, so I cast it to the any type and then immediately to the number type to work around it. It's probably not really TypeScript's fault; whoever wrote the type definitions for TensorFlow didn't label this method precisely: as you can see, it says it returns an array, even though in our case that array holds a single number.

Now that I have the number, I divide it by our maximum error, which is 255 times 255, and that gives us a normalized score between 0 and 1, i.e., zero to one hundred percent. I'll call that the normalized result, and since I wanted a true or false, I just return this value compared against my threshold. As long as the value is smaller than my threshold, I consider the images similar, which is true; I return false if it's greater than the threshold, because that means they are not similar. Okay, that's all we need for this function.
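Here's a minimal sketch of the function just described, assuming the imports above. The cvToTensor stand-in is my guess at the helper coded in the previous video, and where the video casts through any to number, indexing the one-element array is equivalent:

```ts
const MAX_SQUARED_ERROR = 255 * 255; // comparing pure black to pure white

// Stand-in for the cvToTensor helper from the previous video
// (assumption: the real implementation may differ).
function cvToTensor(mat: cv.Mat): tf.Tensor3D {
  const bytes = new Uint8Array(mat.getData()); // raw interleaved channel bytes
  return tf.tensor3d(bytes, [mat.rows, mat.cols, mat.channels], 'float32');
}

function areImagesSimilar(imageA: cv.Mat, imageB: cv.Mat, threshold: number): boolean {
  const diff = imageA.absdiff(imageB);               // per-channel |a - b|
  const mseTensor = cvToTensor(diff).square().mean();
  // dataSync() is typed as an array, but after mean() it holds exactly one value
  const mse = mseTensor.dataSync()[0];
  const normalizedResult = mse / MAX_SQUARED_ERROR;  // 0 = identical, 1 = opposite
  console.log(normalizedResult);                     // log the score while tuning the threshold
  return normalizedResult < threshold;
}
```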
I'm going to put a breakpoint on this comparison, because the first time I test this I want to see what kind of percentage I get for a matching image, so I know what value to use as a threshold. So let's test out this function. Inside this test code I'm just going to call areImagesSimilar, and I need to pass it two different images; for the threshold, since I don't know what to put yet, I'll just start it at zero. For the images to compare, I'm going to take this blue streak image and look for this blue fire (as well as the red fire) at these screen coordinates. Because I need to crop the screenshot to the correct location, I'll put my mouse on the top left corner of the icon, and in the bottom left corner of my screen you'll see the X and Y coordinates within the image; then I'll drag a box around it, and that gives me the width and height as well. To read the image I'll use the OpenCV imread command, paste in the path to my image, and tell OpenCV what mode to read the image in. I'm going to specify that I want color, because some computer vision algorithms don't need color and work off grayscale instead. Next I'll load the blue streak and red streak icons as well: blueStreak equals imread, paste in the path, and use color; same for the red streak. I'm going to move this comparison image into a variable called gameImage, and I will crop it before I compare whether they're similar. So I'll call cropImage, a method I coded in the last video about Tesseract's optical character recognition (you can review that video for how it's coded), and pass my game image and the coordinates I found in Paint. Now my game screen has been cropped to just the correct coordinates, and I can compare it to the blue streak image.

Okay, I can execute this with the npx command: npx ts-node ./index.ts. npx is npm execute; ts-node handles the TypeScript, so it first compiles from TypeScript to JavaScript and then uses Node to execute the file. Actually, rather than adding a breakpoint here, I'll just log the result of my calculation. So I've got my two comparison icons, the blue streak icon and the red streak icon. For in-game screenshots, I have a board with a blue streak on it, which I can crop to compare, and one with a red streak, which I can crop to compare. Then I can take an image that's not the main board screen at all, so that when I crop that same section I get something completely unrelated, and we can use that to see how well our algorithm works. So: two icons loaded into variables, and for game screens, one example that should have a blue icon, one that should have a red icon, and one that should be completely unrelated. Let's try first with the blue game screenshot against the blue icon and see what kind of result we get.
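A sketch of that test harness under the same assumptions; the file paths and crop coordinates are placeholders for the ones read off the screenshot in Paint, and I inline getRegion where the video uses the cropImage helper from the previous episode:

```ts
const blueStreak = cv.imread('./icons/blue-streak.png', cv.IMREAD_COLOR);
const redStreak = cv.imread('./icons/red-streak.png', cv.IMREAD_COLOR);
const gameImage = cv.imread('./screenshots/board-blue-streak.png', cv.IMREAD_COLOR);

// Placeholder coordinates for the static streak icon, read off the screenshot in Paint
const streakIconRegion = new cv.Rect(/* x */ 0, /* y */ 0, /* width */ 30, /* height */ 30);
const cropped = gameImage.getRegion(streakIconRegion);

console.log(areImagesSimilar(cropped, blueStreak, 0)); // run with: npx ts-node ./index.ts
```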
Run it... I got a 16% match. That seems a bit worse than I would expect. However, if I look at my comparison icon and compare it to the in-game screen: when I crop, I'm cropping a square, even though the icon is round, so it's actually comparing the background around the icon, which I would prefer to exclude. I could improve this algorithm by taking my comparison icon, changing all the white to transparent, and then changing my mean squared error to ignore any pixels that are transparent. The difference it's reporting is probably the fact that the background is purple, and it's comparing purple to white, because the rest of the icon should be the same. And that background can be different in every game of TFT, because the background is a skin the player selects, so that alpha could be important. I'm not going to code that right now; the reason is that there's a different image comparison algorithm that solves some of these issues mean squared error has. So if you want to use mean squared error, I'll leave ignoring the transparent pixels in your source image as an exercise for you.

But let's verify that in general this is working. I got 16%; if I instead compare against the red icon (so my game image is the screen with a blue icon and a purple background, and I'm comparing it with a red icon on a white background), hopefully I should get higher than 16 percent. Okay, 18%, so it is a worse match. In both cases it seems to be indicating that there is a fire icon there, and if I want to know whether it's blue or red, I should just compare against both and keep whichever value is smaller, because it's the closer match. Now let's compare against this carousel image, which should be completely different; hopefully we get a very high number, meaning it's not similar at all. Run it... okay, 0.34, so about double. If I were to give a default value to this threshold (I was just passing in zero), maybe I'd say anything 0.2 or lower is considered similar and anything higher than 0.2 is not, but whichever value best separates the blue result from the red result is the one I want to keep. If I coded the transparency handling, I would expect values very close to zero instead of 16% and 18%, so that would be an important way to improve this code. The reason I'm not going to is that this was just to demonstrate the basics of image comparison, and I'm going to show a much better algorithm in a second.

So the issue with mean squared error is that not only do I have to deal with background issues, but also any kind of effect: if this icon were glowing to get the user's attention, some kind of animation, that would also affect it, as would screen resolution and possibly differences in hardware. Even though it can work, it's temperamental, and most importantly, it requires that we know the exact X and Y coordinates on the screen. But I mentioned previously that on this carousel screen we still want to search for an icon, and that icon is moving around the screen; we don't know the coordinates. So let's code up a separate function to help us search for an image.

I'm going to modify this test code slightly. I'll keep the carousel, and I'm not going to crop it yet, but I'll remove the other two game images, and instead of the blue and red streak I'm going to load two items. If I look at the carousel screenshot, I see, for example, an item here called Tear of the Goddess. I have a copy of all of the item icons, so I'll load this Tear of the Goddess icon into a variable: paste in the path and rename the variable to tear.
Now, whenever you're testing your code, you want one example that you know should work and one example that you know should fail, so you can try both scenarios and prove that your code really does both work and fail correctly. For my bad example, I'll look at this carousel and pick any item that's not shown on it. This item, Blue Buff, is not in the screenshot anywhere, but it's a good example to compare against Tear of the Goddess because they both have a lot of blue, and I want to see if the algorithm can distinguish between the two. So I'll put this Blue Buff into a variable; okay, I've pasted the path in and renamed my variable.

Instead of checking if images are similar, this time I want to code a new method called searchForImageLocations. This function takes a template, which will be my Tear of the Goddess or Blue Buff item, and that will be a computer vision matrix (an image). I want to search for it within a background image, which will also be a computer vision matrix; again I want some kind of threshold, though I don't know what value I'll use yet; and I want it to return an array of rectangles. A rectangle has an X and a Y coordinate for the top left corner, plus a width and a height, so it tells me the exact box where the image was found.

This method is actually quite simple, because OpenCV has a built-in method for it: the matchTemplate function. I call it on my background image, pass in my template image, and tell it which algorithm to use. You can experiment with the different algorithms to see which works best, but I prefer squared-diff-normed over the others; it seems the fastest and equally accurate. I'm not going to go into depth on how this method works, but if you want to Google it, the term is the fast Fourier transform, or FFT. It's named after a French mathematician, you would study it in a linear algebra class, and it's a complex math formula over matrices that, in our case, helps us search images.

I'm going to put the result of this function into a variable, and the result is a matrix, which we could think of as an image, though not an image you would typically look at. Instead, it's a calculation, at each pixel, of how likely that pixel is to be the location of the template. For example, say I had an image like this, and my template was this teal puzzle piece. Then the resulting matrix would be something like this, where the top left corner is a 100% match and all three of the other corners are a 0% match. However, we have to remember these are pixels, so if we show grid lines, each of these squares is a pixel. In reality the top left pixel would be a hundred percent match, but the center of the teal piece would probably only be a 50% match, because the score tells you how likely the template is to start at that location, and if it started in the middle of the teal, the teal would bleed down into the red and over into the orange. So after we run the fast Fourier transform, we can scan the matrix for a cluster of really strong scores and use that to make a good guess at the location of our search image, and once we've found it, we can use the mean squared error algorithm we coded previously to double-check that we really found the correct image.
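Sketched in code, the entry point might look like this. The Rectangle shape is as described above (field names are my choice), findPossibleMatches and consolidateAndVerify are filled in further down, and the threshold ratio is explained later (dividing by it rescales the caller's mean-squared-error threshold onto the TM_SQDIFF_NORMED scale):

```ts
interface Rectangle {
  x: number;      // top left corner of the found region
  y: number;
  width: number;
  height: number;
  value: number;  // score at (x, y); for TM_SQDIFF_NORMED, smaller is better
}

// Explained further down: an MSE threshold and a TM_SQDIFF_NORMED threshold
// live on different scales, so we rescale by a constant ratio.
const MSE_TO_SQDIFF_RATIO = 0.25;

function searchForImageLocations(template: cv.Mat, background: cv.Mat, threshold: number): Rectangle[] {
  // Each pixel of `result` scores how well the template matches starting at that pixel
  const result = background.matchTemplate(template, cv.TM_SQDIFF_NORMED);
  const possibleMatches = findPossibleMatches(result, template, threshold / MSE_TO_SQDIFF_RATIO);
  return consolidateAndVerify(possibleMatches, template, background, threshold);
}
```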
So the benefit of the fast Fourier transform is in its name: it's very fast. Think about what it would take to use the mean squared error algorithm to search this entire image for the teal puzzle piece. If it doesn't know the location ahead of time, it would have to check every possible location and run the mean squared error at each one: once at this top left coordinate, then again one coordinate over, then a third time at the next coordinate. Given the size of the grid, you've got hundreds of coordinates to check, so if each individual mean squared error check took, say, a second, and you have to run it hundreds of times, it's going to take way too long. In fact, think about a 4K image, which is fairly standard nowadays for gaming: 3840 wide by 2160 tall is the number of pixels in a 4K image, so you would have to run the mean squared error algorithm about 8 million times to search one image. Computers are fast, but running a calculation eight million times is going to be slow. I tested the mean squared error algorithm, and on my machine, which is a top-of-the-line fast computer today, it takes 200 microseconds (two tenths of a millisecond) to run a single check.

And it gets worse, because that's not just 8 million checks for a single 4K image; it's 8 million for each image we're searching for. On this carousel screen there are multiple icons: if I search for the Tear of the Goddess icon 8 million times, I'll only find this one. Separately I have to search for the Giant's Belt 8 million times, and separately for the Recurve Bow 8 million times, and there are 54 different items to search for. And because these items are moving around in real time, if I want my character to move to pick up an item, I have to keep searching to track the items' new locations and keep correcting my trajectory. It's just not feasible: 8 million searches times 54 items at 200 microseconds each (there are a million microseconds in a second) is roughly 90,000 seconds, and at 3,600 seconds per hour, that means it would take about 24 hours to search this carousel image in 4K for all 54 items. Obviously that's not going to work: the game will be over in 20 minutes, and this screen will be over in 10 seconds. We can't take 24 hours to analyze the image.

There are micro-optimizations we could do. Rather than searching the entire image, we could crop it to just the center section, and that would help. In addition, instead of searching in 4K, we could scale the image down small, do the search, and once we think we've found the correct location, scale back up and double-check just those exact coordinates. But those kinds of micro-optimizations might bring it down from 24 hours to maybe one hour, and that's still not good enough. In this situation we really just need a different algorithm, and that's why we're going to use the fast Fourier transform.
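To make that cost estimate concrete, here's the arithmetic written out, using the measured 200 microseconds per check:

```latex
3840 \times 2160 = 8{,}294{,}400 \approx 8 \times 10^{6} \ \text{positions per image}

8 \times 10^{6} \ \text{positions} \times 54 \ \text{items} \times 200\,\mu\mathrm{s}
\approx 8.6 \times 10^{10}\,\mu\mathrm{s} \approx 86{,}400 \ \mathrm{s} \approx 24 \ \text{hours}
```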
Another benefit to FFT is that it's more resilient to minor changes in the image: if the image has been rotated slightly, or has a small shader effect like a glow, as long as it's still pretty close to the image we're searching for, it should still find a match. As I mentioned before, this is how the fast Fourier transform results would look, except that in our example each puzzle piece isn't one single pixel; it's actually a lot of pixels. So instead of looking like this, it would actually be something like this, where the top left pixel is a 100% match, the pixels around it are a 90% match, the pixels around those are an 80% match, and so on: as my cursor gets further from the correct position for the teal puzzle piece, it's less and less of a match.

Another way to visualize this is on the carousel screen after running the fast Fourier transform. If I search for an icon that does not exist anywhere, the carousel screen kind of turns to grayscale, but there's no obvious indicator of where the icon is, because it couldn't find it; the result should turn black where it finds a match, and the darkest we get is a gray. It doesn't really look like the carousel screen anymore, but you can see a vague resemblance. Whereas if I search for an icon that does exist, you'll notice a distinct black dot on the image where it was found. As I mentioned, we want to look for a cluster of black dots all next to each other, because a lot of pixels close together that all have a high match percentage is a good indicator that it found the image, whereas a single little black pixel by itself is not a strong indicator. Ideally, the size of the cluster of black pixels should be roughly the size of the image you're searching for. So that's how the matchTemplate function works.

What's really cool is that when I tested with a cropped version of the carousel, just analyzing the center section, it was able to search for a single image in about 80 milliseconds. Searching for 54 different icons is then about 4,000 milliseconds, and there are 1,000 milliseconds in a second, so essentially four seconds. We've gone from one hour as the best-case scenario with the other algorithm down to four seconds with this one. It would be better if it were under four seconds, but I think that's fast enough for the AI to make a decision, and we could probably do some micro-optimizations if we needed to. The easiest is that computers nowadays have multiple cores; on my PC I've got 16 cores, and each can run a different method at the same time. Since I'm searching for 54 images, I can let each core search just four or so of the images; then all 16 cores will be fully active at the same time, multitasking, and instead of four seconds it should take about a quarter of a second, because the work is divided across 16 cores. That would be plenty fast.

So let's finish coding this up. Now that OpenCV has given me the results matrix, I just need to search it for areas of black pixels and then double-check those locations with the mean squared error. Although mean squared error is slow when you run it across millions of locations, this narrows it down to very few locations, so checking just those will be fast. To iterate through my matrix, I'll make two nested for loops: the outer loop iterates over the columns and the inner loop over the rows, using variables x and y. That means on the inner line of code, the variables x and y give the exact pixel of the image I'm looking at, and for a 4K image it will run 8 million times, once per pixel. That's okay, even though it's iterating a lot, because these kinds of statements are very fast; the computer can handle checking a single pixel eight million times. Running the mean squared error 8 million times is what would be an issue, because mean squared error does a lot of math each time. To check the value at the current pixel, I use the at function on the OpenCV matrix and pass in my coordinates. That's my current value, and I check if it's smaller than my threshold; if so, I consider it a possible match. So I'll make an array of all my possible matches, and if the value is less than my threshold, I push a new entry onto the array. Remember, I'm using the rectangle structure, which has an X and a Y coordinate as well as a width and a height, so I make a new object with x and y, plus width (my search template's width) and height (my search template's height); because it's a matrix rather than an image, OpenCV uses the terms cols and rows instead of width and height. Finally, I include the value, which is the percent match.
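A sketch of that scan, reusing the Rectangle interface from the earlier sketch (note the at() gotcha that bites later: OpenCV takes the row, y, before the column, x):

```ts
function findPossibleMatches(result: cv.Mat, template: cv.Mat, threshold: number): Rectangle[] {
  const possibleMatches: Rectangle[] = [];
  for (let x = 0; x < result.cols; x++) {    // outer loop: columns
    for (let y = 0; y < result.rows; y++) {  // inner loop: rows
      const value = result.at(y, x);         // row (y) first, then column (x)!
      if (value < threshold) {
        possibleMatches.push({
          x,
          y,
          width: template.cols,  // a Mat calls these cols/rows, not width/height
          height: template.rows,
          value,                 // smaller = closer match (0 is black, a perfect match)
        });
      }
    }
  }
  return possibleMatches;
}
```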
Now that I have a list of coordinates to consider, I'll loop over these possible matches and check for clusters of matches that are close together, combining each cluster into one single match. The way I'll consolidate a cluster of black pixels into a single match is by removing the duplicate gray pixels around it. If I zoom in on these black pixels (and remember, this is a matrix, and without doing some math I don't know which point is the darkest), what I can do is sort by the darkest pixels, which gives me the spot in the center. If I know the sword icon I'm searching for is 50 pixels wide by 50 pixels tall, then after finding the darkest pixel I can draw a 50-by-50 box around it and say that any pixel within it with a lighter value (a weaker match) can be removed as a duplicate. That way, if I have two separate dark spots, I can consider them both matches, but if one dark spot contains 100 black pixels, I keep only the best match among those 100 instead of reporting every one of them. And after I've narrowed it down to the darkest pixel within each cluster, I can run the mean squared error on each of those as a final check that I've really found the correct spot.

To do that, first I sort my possible matches using the orderBy function from Lodash, sorting by the value, which is the percentage match. Remember, we were checking for values less than the threshold, so the smaller the value, the closer the match: 0 is black, a perfect match, and 1 is white, not a match at all. This sorts ascending, so the black pixels are at the beginning of my array and the white pixels at the end, and since I already narrowed it down to values under my threshold, I'm not considering that many pixels anymore; really only the black and dark gray ones. Now that it's sorted, I'll make an array to store the duplicates and loop over each of my possible matches. I'll call the loop variable a, because I want to do an A/B comparison between two different pixels, to see whether any gray pixels fall within the boundary of a black pixel, because those gray pixels can be removed. The B values will be the gray pixels, and the reason I know that is that, since I've sorted, the first elements in the array are the darkest. So a starts at the darkest pixel and loops over the entire array, but b starts at a + 1, so it only compares against pixels that are less dark. I'll put the first match into a matchA variable and the second into matchB, and since I already know that b is not as dark, if it's within the same region as the first pixel, I consider it a duplicate. The size of the region is the size of the search image, or template, which is basically the size of the item I'm searching for. To check that, I subtract the coordinates of the two pixels and see if the distance between them is less than the width of my search image; that means they're close enough together to be considered the same region, and so the lighter one is a duplicate. I use Math.abs on the subtraction so that the order of subtraction doesn't matter and I don't have to consider negative values, and I do the same thing with the height. So if the second pixel, which is not as dark, is within both the same vertical and the same horizontal region as the first, darker pixel, I mark the second pixel as a duplicate by pushing it into the duplicates array.

Now that I have a list of all the duplicates, I can get my final set of matches using the Lodash difference function, which applies set logic to arrays: it removes from the first array any values that exist in the second array. My initial array is all of my matches, and I'm removing any values that are in my duplicates array. With the duplicates removed, I double-check with the mean squared error to remove any matches that were incorrect. I'll put these into a verifiedMatches variable and return it at the end of my search function. All I need to do to test each match is call the filter function. Filter comes from functional programming: it loops over every element in the array and passes each element to an inner function; the parameter of the inner function is the element currently being looped over, and after the arrow comes the body of the function. That function returns true or false based on whether I want to keep the value in the final array, so I call the areImagesSimilar function I coded previously, and since it gives me true or false, returning its result keeps only the matches where mean squared error says the images are similar.

So I'm using the fast Fourier transform to do the initial search, which is mostly accurate and also fast, and then I'm double-checking those results with mean squared error, which is more accurate but isn't designed to search a large image; it's only designed to compare two images. To call areImagesSimilar, I first need to crop my carousel to the right coordinates: the background image passed into my function is the carousel image I'm searching, and the crop location is the current match, so I have match.x, match.y, match.width, and match.height. Now that I have my cropped image, I can compare it to my original template, the item icon I was searching for. Then we need the threshold value. As we discussed before, with areImagesSimilar we needed 0.2 for it to work, because I was looking at an icon with transparency around its border, and the background bleeding through was causing a poor match. But all of my items are square; none of them have transparency, so that won't be an issue.
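Pulling the sort, dedup, and verification just described together (a sketch under the same assumptions; the video crops with its cropImage helper, which I replace with the equivalent getRegion call):

```ts
function consolidateAndVerify(
  possibleMatches: Rectangle[],
  template: cv.Mat,
  background: cv.Mat,
  threshold: number,
): Rectangle[] {
  // Ascending by score: darkest (best) matches come first
  const sorted = _.orderBy(possibleMatches, ['value'], ['asc']);

  // Any lighter pixel inside a template-sized box around a darker pixel is a duplicate
  const duplicates: Rectangle[] = [];
  for (let a = 0; a < sorted.length; a++) {
    for (let b = a + 1; b < sorted.length; b++) {
      const matchA = sorted[a];
      const matchB = sorted[b];
      if (
        Math.abs(matchA.x - matchB.x) < matchA.width &&
        Math.abs(matchA.y - matchB.y) < matchA.height
      ) {
        duplicates.push(matchB);
      }
    }
  }
  const matches = _.difference(sorted, duplicates); // set-subtract the duplicates

  // Final check: re-verify each surviving location with mean squared error
  const verifiedMatches = matches.filter((match) => {
    const cropped = background.getRegion(new cv.Rect(match.x, match.y, match.width, match.height));
    return areImagesSimilar(cropped, template, threshold);
  });
  return verifiedMatches;
}
```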
Because of that, I can probably use around a four percent similarity threshold, since it'll be a much closer match; the value can be much smaller. I'll need to test with this matchTemplate function to determine what threshold works well for it, because it's a different algorithm than mean squared error, so its scores may fall in a completely different range. And that's the issue: I have a threshold being passed into searchForImageLocations, but I also need to pass a threshold into areImagesSimilar, and because they use different algorithms, their thresholds are on different scales. Any time you're comparing values from two algorithms with different scales, you need to normalize the values to convert them to the same scale. I've already done testing on this, and I know that if I use the same four percent threshold that mean squared error needs, it will not work with the fast Fourier transform; the threshold will be too low. So I'm going to normalize it with a constant. The value I found works best is about 0.25, and I'm putting it in a constant variable so it's easy to modify in the future as needed. I take the threshold passed into my function and divide it by this ratio to get a threshold that works well: for example, 0.04 divided by 0.25 gives 0.16, and I found that with the fast Fourier transform and the squared-diff-normed algorithm, 0.16 works well for searching for items on the carousel. Now I can pass the original threshold into areImagesSimilar, and this code should be done.

Let's execute our test function. We've got our carousel image, and we pass that to searchForImageLocations as the background we're searching within; then we pass in Tear of the Goddess as the template, which I expect it to find. Afterwards we'll test with Blue Buff, which it should not be able to find. I'll put the results into an imageLocations variable and output that to the console so we can see how it worked. JSON.stringify takes a JavaScript variable and converts it to JSON format, and JSON is designed to be both human- and machine-readable, so it's useful for logging, particularly when you have complex values. In our case imageLocations is an array of rectangle objects, each with an x, y, width, and height, so this lets me print all of that really easily for debugging.

Let's open up the console and run our code. Okay, I got an error: index out of bounds, calling the at function on a computer vision matrix. I realize the mistake I made: OpenCV wants the Y value passed first and the X value second. That was my mistake. Okay, now that we've got that error fixed, let's try one more time. Okay, we're not getting an exception, but it's not finding the image anywhere, so let's see what mistake I made. I mentioned earlier that the fast Fourier transform can handle slight discrepancies, like the image being slightly rotated or having a minor shader effect such as a glow, but we still want to minimize those as much as possible, because the more effects there are making our images not match, the more we have to raise our threshold. In this case there aren't any effects; however, the images are not scaled quite the same. If I highlight this icon, you can see it's around 30 pixels wide and tall.
But the item icons I have, for example Tear of the Goddess, when I open it, it's 64 by 64. So the icon file I have is higher quality and larger than what's being rendered on screen, and I think if I resize my icons to match the size on the carousel, it will work better. OpenCV has a resize method, so I'm just going to pass in 32 and 32 to match the carousel, and I'll do that both on Tear of the Goddess and on Blue Buff. Let's try. Okay, perfect: this gave me exactly one coordinate where it says it found the Tear of the Goddess, which is 859, 553. Let's open up our carousel; as I move the mouse around, the bottom left corner of the screen shows the coordinate I'm at. If I go to the top left corner of Tear of the Goddess I get, let's see, 856, 551 versus 859, 553, so it's almost spot on. It was off by a couple of pixels, but I think that was me not holding the mouse in exactly the right position; I think the code did perfectly. And you'll notice the value 0.06, so it reported quite a low score, which is great. Now, that might seem wrong, because I said it has to be 0.04 or smaller, but it's actually fine: I normalized the value when I compared it, but I did not normalize it when I stored it into my match results. That's a little bug I could fix, but it doesn't really matter; it's just a debugging value and doesn't affect my results.

Let's try it now with Blue Buff. In this case I hope it does not find any image at all, because it should not be on the carousel screen. Okay, perfect, it didn't. Now I am curious: if I didn't run the double check that produces the verified matches, since both items have a lot of blue in them, would it still return a match? I don't know whether it will or not, but I'll just comment out that section temporarily and return the matches without verifying them, out of curiosity. Okay, even then it found nothing, which is really great news. It shows that with the fast Fourier transform, even though Blue Buff has quite a bit of blue like the Tear of the Goddess, they're different enough that it doesn't get confused. I think that's because the Fourier transform is actually able to recognize shapes as well as intensities: the intensities of the colors are quite similar, but the shapes are very different, and the Blue Buff also has this kind of brownish material around the edges. So it's good to know that I didn't strictly need to verify the matches, but it doesn't slow down the result much, and I think it's nice to double-check.

Another place the fast Fourier transform might be useful is searching for these blue question marks, because they get dropped in random spots on the screen whenever you kill minions, and you need to right-click on them to open them and receive a random bonus. I don't need to code anything additional there: if I wanted to do that, I would simply swap the carousel screen for this blue-item-drop screenshot, and instead of the Tear of the Goddess PNG, I would pass in this TFT blue item drop image. This would have a similar issue as the blue and red fire before, where I would want to exclude the alpha channel around it when doing the comparison, but otherwise it should work fine. That's the beauty of these two functions we've written: they're generic, and they can be used for any part of the game, as long as you give them the correct template image and the image you want to search within.
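A hypothetical end-to-end run matching the test above; the file paths and the 32x32 render size are placeholders/assumptions:

```ts
const carousel = cv.imread('./screenshots/carousel.png', cv.IMREAD_COLOR);

// The icon files are 64x64 but render at roughly 32x32 in-game, so resize first
const tearOfTheGoddess = cv.imread('./icons/tear-of-the-goddess.png', cv.IMREAD_COLOR).resize(32, 32);
const blueBuff = cv.imread('./icons/blue-buff.png', cv.IMREAD_COLOR).resize(32, 32);

// Expect one rectangle, something like [{"x":859,"y":553,...}]:
console.log(JSON.stringify(searchForImageLocations(tearOfTheGoddess, carousel, 0.04)));
// Expect an empty array; Blue Buff is not on this carousel:
console.log(JSON.stringify(searchForImageLocations(blueBuff, carousel, 0.04)));
```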
Between this and the Tesseract OCR that I demonstrated in a previous video, we can extract virtually anything we need out of the game while it's being played. And now that we have that, we can send it into our AI model to ask what decision we should make each round. Let me know in the comments below if you have any questions about computer vision techniques. In future videos I plan to cover how I'm gathering match history from online TFT games to train an AI model with, how to use neural networks to take all the complicated statistics in TFT and analyze them to determine the best decision for any given game state, and finally how to remote-control the computer to simulate keyboard and mouse input as if it were a human, so the game can play itself. Hopefully those topics are interesting to you; if there are other topics you'd like to learn about, let me know in the comments below. Stay tuned for future videos, and thanks for watching.