Creating an Open Tetris Database

Thread in 'Research & Development' started by colour_thief, 13 Oct 2016.

  1. I was larking about with that rough playfield/capture tool before a busy period snuck up on me, I keep meaning to revisit it and clean up a bunch of stuff; might be something useful in there, even if just for prototyping:

    Link (because apparently I keep messing up embedding).

    The detection method/points and render method are all swappable (there's multiple-region capturing in there as well).
     
  2. I only had a short glance at steadshot's code, but from what I've understood he's doing luma thresholding to detect the presence or absence of a block. Couldn't we use the color information to get better block detection? The algorithm would be something like:

    1. Segment the playfield into a 20x10 matrix
    2. Average/gaussian blur the whole thing by a sufficient amount
    3. For each tetromino type, if the HSL value is within its range, write that piece
    4. Else, write blank

    Since the background is darker than the pieces, its luma value would fall outside the accepted range.
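    The steps above could be sketched roughly like this (the hue ranges and thresholds here are made-up placeholders, not values measured from the game):

```python
import colorsys
import numpy as np

# Hypothetical hue anchors (degrees) for each piece colour -- placeholders,
# not sampled from the actual game.
PIECE_HUES = {"I": 0, "L": 30, "O": 60, "Z": 120, "T": 180, "J": 240, "S": 300}
MIN_LIGHT = 0.15  # cells darker than this are treated as background

def classify_playfield(field_rgb):
    """Map an RGB playfield image (rows x cols x 3, floats in [0, 1])
    to 20 strings of 10 cells: a piece letter, or '.' for empty."""
    h, w, _ = field_rgb.shape
    ch, cw = h // 20, w // 10
    grid = []
    for r in range(20):
        row = ""
        for c in range(10):
            # Average the cell -- a cheap stand-in for the blur step.
            cell = field_rgb[r*ch:(r+1)*ch, c*cw:(c+1)*cw].reshape(-1, 3).mean(axis=0)
            hue, light, _ = colorsys.rgb_to_hls(*cell)
            if light < MIN_LIGHT:
                row += "."   # too dark: background
                continue
            deg = hue * 360
            # Nearest piece hue on the colour wheel (circular distance).
            row += min(PIECE_HUES, key=lambda p: min(abs(deg - PIECE_HUES[p]),
                                                     360 - abs(deg - PIECE_HUES[p])))
        grid.append(row)
    return grid
```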
     
  3. Muf

    You can subtract the background out of the stack if that helps, it's a known variable for each section.
     
  4. Yeah, that's a great idea. It's a bit like using XOR between two bitmasks for pixel perfect collision detection, eh? Subtraction is probably better for video that might not be pixel/color perfect -- then I guess you can go ahead and elide the cells that are within our confidence interval for the zero/empty state and only consider the "dirty" cells?

    Is there a fade-in across level boundaries or is that just TAP?
     
  5. No, there isn't one.
     
  6. Muf


    Exactly. Instead of just having a block threshold (and getting confused between a dark blue locked J-piece and the background), you'd first have a background threshold, that is whether or not the pixel differs enough from the background to be considered for the block decision. With subtraction any value that isn't black should be the stack.
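    A minimal sketch of that subtract-then-threshold idea, assuming the background frame for the current section is known (the noise margin is an arbitrary guess for compression noise):

```python
import numpy as np

NOISE_MARGIN = 30  # per-channel tolerance for video noise (assumed value)

def dirty_mask(frame, background):
    """Return a boolean mask of pixels that differ enough from the known
    background to be considered part of the stack; everything else can be
    elided from the block decision."""
    diff = np.abs(frame.astype(int) - background.astype(int))
    return diff.max(axis=-1) > NOISE_MARGIN
```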
     
  7. These are good ideas but to an extent it's living in a fantasy world where you have roughly pixel perfect video to work from. Pier21 YouTube uploads are... somewhat far from this ideal, to put it mildly. I'm aiming to solve the general problem without manual fiddling as much as possible.
     
  8. Muf

    I doubt you can get accurate replays out of something like Pier21 YouTube videos: at best they will be decimated to 30fps; at worst they will additionally have approximately one dropped frame per second (to account for the 60/62fps discrepancy) and two interlaced fields blended together. You could get a rough fumen out of it, but you'd have to make a lot of assumptions about how the game was played (with poor quality video frames to base those assumptions on) to generate a working replay.
     
  9. I guess we can have different replay capturing programs, as long as the replay file format is standardised.
     
  10. Awesome project. I'll be happy to help whenever I have some spare time.

    As a first step I made an addition to steadshot's script that lets the user select the playfield directly from within the script (link). We can later replace this by some automatic algorithm.

    It should be simple to extend the script to processing a whole video, but we should probably make it work better for tgm1 first. I'll have a stab at that next.
     
  11. I feel like it might be possible to interpolate at least some of the information from the decimated frames. I mean, you have a timer at the bottom of the screen which you can OCR to determine which frame you're looking at (assuming the rendered time is reasonably accurate, and updated every frame). You can use that, along with information from the previous and next frames to make some determinations about what likely happened in between. The most challenging part is that you may also have two frames interlaced and blended together. You would need to be able to detect when that happens, so that you can make some adjustments to what you're looking for.
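    Assuming the timer has already been OCR'd into a game-frame count per video frame, spotting the gaps to interpolate over is straightforward (the nominal step of 2 assumes a clean 60fps-to-30fps decimation; `find_dropped_frames` is a hypothetical helper name):

```python
def find_dropped_frames(timer_frames):
    """Given the OCR'd in-game frame count for each video frame, return
    (index, gap) pairs where the video skipped game frames. A 60fps game
    decimated to 30fps should normally advance by 2 game frames per video
    frame; anything larger implies dropped frames to interpolate over."""
    gaps = []
    for i in range(1, len(timer_frames)):
        step = timer_frames[i] - timer_frames[i - 1]
        if step > 2:  # assumed nominal decimation step
            gaps.append((i, step))
    return gaps
```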

    Background subtraction sounds kind of interesting to me, but I feel like it may have some pitfalls which aren't immediately obvious.

    IMO, the meat of this is generating a graph from an image of a playfield so you can analyze it, and that part is actually really easy to do if you have points representing the 4 corners of the playfield and know its dimensions (10x20 in most cases). You don't even need a high quality video to do this. All you really need to do is calculate the center-point of each block in the grid and look at the intensity of the luma channel, and possibly what color it is. You can look at a histogram of the playfield to automatically tune the thresholds used to make these determinations. The harder problem is locating the playfield with reasonable accuracy, but this part can be stubbed in on a per-video basis until it is solved.
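    For the center-point sampling, cell centres can be interpolated from the four corner points like so (a simple bilinear sketch that ignores lens/perspective distortion; `cell_centers` is an illustrative helper name):

```python
import numpy as np

def cell_centers(corners, rows=20, cols=10):
    """Given the playfield's four corners as (x, y) pairs ordered
    top-left, top-right, bottom-right, bottom-left, return a
    rows x cols x 2 array of cell centre coordinates by bilinear
    interpolation between the edges."""
    tl, tr, br, bl = [np.asarray(c, float) for c in corners]
    centers = np.zeros((rows, cols, 2))
    for r in range(rows):
        v = (r + 0.5) / rows                 # vertical fraction of this row
        left = tl + v * (bl - tl)            # point on the left edge
        right = tr + v * (br - tr)           # point on the right edge
        for c in range(cols):
            u = (c + 0.5) / cols             # horizontal fraction
            centers[r, c] = left + u * (right - left)
    return centers
```

    With the centres in hand, each block is just a lookup of the luma (and optionally chroma) at those coordinates.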

    To solve locating the playfield, I would consider using something like HoughLines, key point detection, or some combination of the two, along with some assumptions about where the playfield should be located. You're basically locating regions of interest for things that look like large 1:2 rectangles, and then looking at those regions in more detail to figure out whether it's actually the playfield. If the CV part of this is fronted by a GUI, then you can have a human manually intervene to correct any errors in the playfield detection.
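    As a crude stand-in for the full HoughLines/key-point approach, candidate regions can already be filtered by the expected 1:2 aspect ratio (the tolerance is an arbitrary choice, and the function name is illustrative):

```python
import numpy as np

def looks_like_playfield(mask, tol=0.25):
    """Take the bounding box of 'on' pixels in a binary candidate mask and
    test whether its width:height ratio is near the expected 1:2 of a
    10x20 playfield. tol is a tuning guess, not a measured value."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return False
    w = xs.max() - xs.min() + 1
    h = ys.max() - ys.min() + 1
    return abs(w / h - 0.5) < tol
```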

    I kind of see three parts to such a system. The first part would be extracting information from videos, into a common file format. This format would contain information (frame number, timestamp, graph of the playfield, maybe some other bits) for each frame in a complete game. Once generated, you could do some post-processing by analyzing multiple frames to fill in any gaps (interpolation), and do some preliminary calculations (such as tracking the "active" piece being played). Finally, individuals can write their own scripts to analyze the final data, to try and look for patterns that they may be interested in, generate replays, fumen graphs, or maybe use it as training data for machine learning projects.
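    One possible shape for such a common per-frame format, purely as an illustration (the field names here are invented, not a settled standard):

```python
import json

def make_frame_record(frame_number, timestamp_ms, playfield_rows):
    """Build one record of the hypothetical common format: video frame
    index, timestamp, and the playfield as 20 strings of 10 cells
    ('.' = empty, letters = piece colours)."""
    return {
        "frame": frame_number,
        "time_ms": timestamp_ms,
        "playfield": playfield_rows,
    }

# A game would then serialize to one record per frame, e.g. as JSON lines.
record = make_frame_record(412, 6866, ["." * 10] * 20)
serialized = json.dumps(record)
```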
     
    Last edited: 5 Nov 2016
  12. Sorry I'm only just commenting on this now; I've had quite a busy period teaching. It's exam period now, and all I'm going to have is marking and my own masters work, so hopefully I can contribute a little bit to the algorithmic side of things at the very least. I may not have time to do much coding, though, sorry.

    Without much thought, I'll throw out some ideas I've had while reading this thread.

    Playfield detection: It's probably easiest not to do this early, especially for the superplayers of TGM. There is almost certainly a moment where you'll have 9 columns almost full. It's probably pretty easy to detect the difference (in terms of a subtraction) between frame 1 and a frame maybe 20 seconds in, and find most of the 10×20 box once you know where 150 or so of those squares have changed. It's probably less likely to work with new players, who have a desire to clear down and Swiss-cheese the stack, but even then a large portion of squares will be different.
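    That frame-subtraction idea might look something like this (the noise margin is a guess, and `changed_region` is a hypothetical helper name):

```python
import numpy as np

def changed_region(frame_a, frame_b, margin=30):
    """Return the bounding box (x0, y0, x1, y1) of pixels that differ
    between two frames by more than a noise margin. For a TGM video,
    frame 1 vs. a frame ~20 seconds in should roughly cover the
    playfield once the stack has built up."""
    diff = np.abs(frame_a.astype(int) - frame_b.astype(int)).max(axis=-1)
    ys, xs = np.nonzero(diff > margin)
    if len(xs) == 0:
        return None  # nothing changed
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```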

    30fps issues: I'm not sure how to solve this issue at all. There has to be some kind of interpolation I think, which isn't particularly nice for getting accurate results.

    Piece detection: I'm not sure we actually need to apply a Gaussian blur to anything assuming we can get an accurate play field detection. If we know "roughly" (within a few pixels) where the play field is, then we can just average the colour and take the colour closest to one of the seven colours for each square over a sufficiently large subsquare. This may cause issues with the L and I pieces though, as they are similar colours.
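    A sketch of that nearest-colour approach; the reference RGB values below are invented placeholders, and real ones would be sampled from clean frames of the game in question:

```python
# Hypothetical reference RGB values for the seven piece colours
# (TGM-style: red I, orange L, etc.) -- placeholders, not measured.
PIECE_COLOURS = {
    "I": (220, 40, 40), "L": (230, 140, 40), "O": (230, 220, 50),
    "Z": (60, 210, 70), "T": (40, 200, 220), "J": (40, 60, 220),
    "S": (200, 60, 200),
}

def classify_cell(avg_rgb, empty_threshold=60):
    """Average colour of one square -> piece letter, or '.' if the square
    is too dark to be anything but background. empty_threshold is a
    tuning guess."""
    if max(avg_rgb) < empty_threshold:
        return "."
    return min(PIECE_COLOURS,
               key=lambda p: sum((a - c) ** 2
                                 for a, c in zip(avg_rgb, PIECE_COLOURS[p])))
```

    The L/I confusion mentioned above would show up here as two reference colours with a small distance between them; widening the sampled subsquare or adding the luma channel might help separate them.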

    APM and piece counters: Although these would both be hooks, I think they are important. I remember a thread from when I very first started playing here saying something about how the West would be faster if we reduced our number of inputs, even just a little bit. We could verify that conjecture; I suspect pieces per second / inputs per second will be higher in faster players, and will probably have an interesting distribution if we look at all skill levels.

    Subtracting the background: This may create issues in that it may cause pieces to transform to roughly the same colour, again the L and I pieces will be the most annoying I think. Besides that, I do like this idea a lot! As for how to actually do the subtraction on a low quality video, we know what the values of the background should be, so we could find out how shifted the video is from being perfectly centred by doing a least squares type adjustment. Essentially, offset based on what gives the minimum error, then pray it's correct. The error may be hard to detect though, maybe making the background binary (black or white) and doing the same thing to the video and doing a least squares type adjustment on that will be sufficient. I'm not sure.
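    That least-squares offset search could be brute-forced over small shifts, something like the sketch below (wrap-around from `np.roll` is ignored for simplicity, and the function name is illustrative):

```python
import numpy as np

def best_offset(frame, background, max_shift=3):
    """Brute-force the small (dx, dy) shift that minimises the mean
    squared error between a video frame and the known background --
    i.e. 'offset based on what gives the minimum error, then pray'."""
    best, best_err = (0, 0), None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(np.roll(frame, dy, axis=0), dx, axis=1)
            err = float(((shifted.astype(int) - background.astype(int)) ** 2).mean())
            if best_err is None or err < best_err:
                best, best_err = (dx, dy), err
    return best
```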

    Anyway, take all of these ideas with a grain of salt. It's early for me, it's also a Sunday, so I don't want to think too hard. Hopefully they spark a better idea in someone else though, and they can stew in my brain where my subconscious can attack them.
     
  13. First of all, very cool project idea. Very ambitious. I don't doubt colour thief's skill; he has helped me before when I didn't have answers to a problem. And looking at the date of this thread, no one has said "it can't be done" yet, so that must be positive, right?

    Just take the task of manually doing this for one video (a human watching the video and converting it into a replay format by hand, or with some simple tool). Maybe even have multiple people do the same work, and then compare the outcomes to find the different errors. If even that comparison cannot be done by a computer, could a computer really do the whole job by itself?

    A video, especially a YouTube-converted video, has a low frame rate (30 fps or lower). If the source was interlaced, frames might be blurred/merged together in a very dirty way, so you can't see which move was made, which frame came first, or which field belongs to which frame. Ambiguity is high.

    Frame skips in a video will happen, and input has to be guessed: if the skip is followed by a merged interlaced frame, say the level counter shows 237 and 238 at the same time but the piece is clearly in the same location, was there any input applied during that lost time, any DAS, and which real game frame was the video on?

    Image detection, much like voice recognition or handwriting recognition, is still not at a point where we can expect an accurate result. A replay can only be 100% correct if no errors were made during detection. The viability of this project assumes a full replay can be generated without a single error; even a replay with about 3 errors would be a tedious task to correct manually.

    I have spent a lot of time in the past trying to convert analog pictures to digital by hand, with zero success. Even knowing colors and pixel ratios ahead of time, the issue comes down to there being too many factors, the most prominent one being that the data is not lossless. How many annoying hours of my life have I spent hating things not being lossless (looking at JPEGs and MP3s with pure anger and hate).

    But one thing that I like about this idea is being able to take a crude piece of something and turn it into a digital sequence you can always count on. Even just finding the randomizer sequence (basically, the starting seed) would be a satisfying task to solve.


    When most people come up with a project or an idea for a project, it is usually AI related. I have seen people come and go, but still no AI has been developed. Even if it is only a sub-category of recovering a replay, having an AI able to play TGM is still very interesting to me. Still, I haven't even found the time to figure out how to build an AI, something I know you have the skills to do, so that alone gives me very little hope of helping with your project, colour thief :(
     
  14. Zaphod77 (Resident Misinformer)

    I believe poochybot was capable of playing Grade Mania 3 and stacking for tetrises, but I could be misremembering.
     
