Comments on: Killer Game App?
https://blog.learnlets.com/2008/09/killer-game-app/
Clark Quinn's learnings about learning
Wed, 01 Oct 2008 00:08:41 +0000

By: Clark https://blog.learnlets.com/2008/09/killer-game-app/#comment-70461 Wed, 01 Oct 2008 00:08:41 +0000 http://blog.learnlets.com/wp/?p=383#comment-70461

Matt, hadn’t heard of Rasch theory, but it sounds like a more useful model than IRT (e.g. the Wikipedia definition).

Yes, I really distinguish formative/summative by whether anyone but the learner sees the result.

It’s intriguing to think about VR, and the fact that people are ‘behaving’, but matching the context to particular measures and evaluating responses is a challenging task… Thanks for the comment!

By: Matt Barney https://blog.learnlets.com/2008/09/killer-game-app/#comment-70427 Sun, 28 Sep 2008 22:18:45 +0000 http://blog.learnlets.com/wp/?p=383#comment-70427

I think your insight about virtual reality-based assessment is spot on. In particular, consider the utility of unobtrusive VR-based pre-hire selection. In traditional “high stakes” personnel selection, it’s relatively obvious that one of several responses is expected. In a Second Life or Lively.com type of virtual world, by contrast, the very choice (meta decision) itself becomes data. The sequence in which game or activity alternatives are selected, and the options chosen within game-like assessments, are all potential candidates for loading on a particular “ruler of interest” (confession: I’m more fond of Rasch measurement than other forms of IRT). It also gives a wholesale better way to detect faking/misrepresentation, through the use of Rasch approaches’ “misfit” (inlier and outlier fit statistics) on unlikely or distorted response patterns.
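As a rough illustration of the “misfit” idea, here is a minimal Python sketch of the dichotomous Rasch model together with the standard infit and outfit mean-square statistics for one person. The item difficulties, the ability value, and both response patterns are made-up numbers, and the person’s ability is assumed to be already known; a real analysis would estimate it from the data.

```python
import math

def rasch_probability(ability, difficulty):
    """Rasch model: probability of success on a 0/1 item (both values in logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def fit_mean_squares(responses, ability, difficulties):
    """Return (infit, outfit) mean-squares for one person's 0/1 responses."""
    squared_residuals, variances = [], []
    for x, b in zip(responses, difficulties):
        p = rasch_probability(ability, b)
        squared_residuals.append((x - p) ** 2)   # observed minus expected, squared
        variances.append(p * (1.0 - p))          # model variance of the response
    # Outfit: plain average of squared standardized residuals (outlier-sensitive).
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(variances)
    # Infit: information-weighted version (inlier-sensitive).
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Hypothetical items ordered from easy to hard, and a person of average ability.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
ability = 0.0

expected_pattern = [1, 1, 1, 0, 0]    # succeeds on easy items, fails hard ones
distorted_pattern = [0, 0, 1, 1, 1]   # fails easy items, succeeds on hard ones

infit_ok, outfit_ok = fit_mean_squares(expected_pattern, ability, difficulties)
infit_odd, outfit_odd = fit_mean_squares(distorted_pattern, ability, difficulties)
print(f"expected pattern:  infit={infit_ok:.2f} outfit={outfit_ok:.2f}")
print(f"distorted pattern: infit={infit_odd:.2f} outfit={outfit_odd:.2f}")
```

Mean-squares near 1 are what the model expects; values well above 1, as for the distorted pattern here, flag response strings that are unlikely given the person’s measure.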

Unlike some of the other responses, though, I think the “formative” and “summative” ways of thinking about assessment are outdated. I acknowledge that measurement has different purposes, and that the size of standard errors and the utility of feedback vary as a function of the specific application. Good measurement that captures the essence of a human phenomenon is more about linear measures with sufficiently small errors and good feedback than it is about the specific category of developmental or achievement testing. Rasch is particularly useful for the feedback aspect in either scenario, and I could imagine virtual assistants suggesting additional games or virtual worlds to visit that are more likely to assess dimensions with relatively little information (e.g. larger than useful SEs).
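One way to picture the “virtual assistant” suggestion: under the Rasch model, each item contributes information p(1 - p) at the person’s current measure, and the standard error is one over the square root of the total information. The sketch below uses that to pick the dimension most in need of more data. The dimension names, ability estimates, item difficulties, and the SE threshold are all hypothetical.

```python
import math

def rasch_information(ability, difficulty):
    """Fisher information of one dichotomous Rasch item at a given ability."""
    p = 1.0 / (1.0 + math.exp(-(ability - difficulty)))
    return p * (1.0 - p)

def standard_error(ability, difficulties):
    """SE of the ability estimate: 1 / sqrt(total information)."""
    info = sum(rasch_information(ability, b) for b in difficulties)
    return math.inf if info == 0 else 1.0 / math.sqrt(info)

def next_dimension(profile, max_useful_se):
    """Name the dimension with the largest SE, if it exceeds the useful limit."""
    ses = {name: standard_error(ability, items)
           for name, (ability, items) in profile.items()}
    name, se = max(ses.items(), key=lambda pair: pair[1])
    return name if se > max_useful_se else None

# Hypothetical profile: dimension -> (current ability estimate, difficulties seen so far).
profile = {
    "negotiation": (0.5, [-1.0, 0.0, 0.5, 1.0, 1.5, 0.2, 0.8, -0.3]),
    "attention_to_detail": (-0.2, [0.0, 1.0]),
}

suggestion = next_dimension(profile, max_useful_se=1.0)
print("Suggest a game that measures:", suggestion)
```

With these numbers the thinly measured dimension is the one suggested; once every SE falls under the threshold the function returns `None`.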

I’m particularly intrigued by the potential to simultaneously measure various latent constructs in virtual reality settings, because I suspect we’re much more likely to get more data, across longer periods of time, than with traditional and tedious forms of assessment.

Thanks for the thoughtful post.

By: Stephen Davies https://blog.learnlets.com/2008/09/killer-game-app/#comment-69853 Fri, 12 Sep 2008 17:17:14 +0000 http://blog.learnlets.com/wp/?p=383#comment-69853

The knowledge line was more a descriptive demonstration than an actual real-life example. I agree that games should focus on skills rather than knowledge (which is a well-known discipline in itself). Maybe I should have used an example where we were looking to test 2 different sets of skills instead.

I agree that intuitively the learning and assessment should be wrapped together. In practice, however, when we are developing a testing rather than a training environment, we find that we have to apply significantly different design approaches depending on the requirements set forth by the client organization.

I think the point I was making in my comment about the psychometric field was about the APPROACH to testing design rather than the psychometric principles inherent in existing testing schemes. One big issue is going to be demonstrating the validity of high stakes assessment, and the credentialling field has gone a long way to put rigour into this. For reference, see ISO/IEC 17024, a standard for personnel certification organizations that covers (in part) how to set up and manage testing and examination schemes. In a multiple choice environment it is easy to show whether there are any testing artifacts by using blind studies; how can we do the same thing in a gaming environment? Organizations that require high stakes testing need this validation that the test is not flawed, and so we do need to put a lot of thought into how we approach this.

I think the point I am (badly) trying to make is this. Up until now we have tested a person by (usually) scoring them in some way. Think back to high school: in Math, it was just as important for the teachers to look at your workings, to see where you were going wrong, as it was to look at the score.

This is what I think games can do – provide an insight into the “workings” of how somebody does something. But for a valid, robust assessment we need both the score and the workings to maximize the value of the test.

So, to your last point, I really do agree that identifying the objective is key. But in working with our clients, the objectives do fall into 2 categories: training-outcome objectives and assessment-based objectives. This still leads me down the path of having to differentiate 2 types of games (believe me, we have tried to integrate the approach).

A real example… we are doing games to train auditors. In these, coverage of the standard is key, and how to engage with people is a critical competency that needs robust training; a games practice environment is beautifully suited to this.

Another company needs to assess its auditors on how consistently they can audit to a particular standard with respect to corporate guidelines. Same subject as the game above, but the objective is very different, and the two products, whilst having some similar mechanisms, have very different frameworks.

OK, not sure if I have covered all of your points; my brain is a bit fuzzy at the moment. I will re-read and get back to you on anything I have missed.

Regards

Steve

By: Clark https://blog.learnlets.com/2008/09/killer-game-app/#comment-69847 Fri, 12 Sep 2008 15:59:45 +0000 http://blog.learnlets.com/wp/?p=383#comment-69847

Stephen, I agree that very carefully specified competency definitions (learning objectives) are critical (though, quibble, I wouldn’t make them about knowledge, but about skills). However, I think you’re confounding two things. High-stakes assessment is summative (what can they do?); it’s not formative or diagnostic. If you want to be diagnostic, then, yes, you need to have a more granular assessment. Though you probably need only one extra check: can they do X correctly? If they can, then we know it’s the speed issue. The valid scores have to do with your validation of the assessment: independent tests, or going all the way to Item Response Theory (shudder). Yes, it’s seriously engaged with psychometrics.

My point on training/game design is that you should identify the objective, and design/align your assessment to it. Then you design your learning to achieve it. Or, in the case of a game, design it to embed the assessment in a way that aligns all the elements: meaningful application, meaningful to the player, balanced challenge, etc. If the organization wants to combine those two performance objectives, that would be in their Mager-style assessment (correctly perform A x times within y seconds), and you might then have two sub-objectives: first, correctly perform A, and then, perform A within time period y/x. However, I truly believe that games are educational practice (inherently assessment), and that the learning is wrapped around them. So I don’t see your distinction between training games and assessment games (though you can embed concepts, examples, and maybe reflection within the game environment); I suggest not thinking of the game as THE training, but as practice that can motivate attention to the learning.
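To make the two sub-objectives concrete, here is a small Python sketch that checks them separately, so that a shortfall can be attributed to accuracy or to speed. The function name, the shape of the attempt records, and the sample numbers are hypothetical, and counting only attempts that are both correct and within y/x seconds is just one possible reading of the objective.

```python
def check_objective(attempts, x, y):
    """Check "perform A correctly x times within y seconds" as two sub-objectives.

    attempts: list of (correct, seconds) tuples, one per performance of A.
    """
    per_attempt_limit = y / x  # time allowed for a single performance of A
    correct_times = [seconds for correct, seconds in attempts if correct]
    fast_and_correct = [s for s in correct_times if s <= per_attempt_limit]
    return {
        "performs_A_correctly": len(correct_times) >= x,
        "performs_A_within_time": len(fast_and_correct) >= x,
    }

# Hypothetical player: accurate three times, but only one attempt is fast enough.
attempts = [(True, 14.0), (True, 9.0), (False, 8.0), (True, 16.0)]
result = check_objective(attempts, x=3, y=30)
print(result)
```

Here the first sub-objective is met and the second is not, which is the “it’s the speed issue” case.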

By: Stephen Davies https://blog.learnlets.com/2008/09/killer-game-app/#comment-69843 Fri, 12 Sep 2008 14:56:19 +0000 http://blog.learnlets.com/wp/?p=383#comment-69843

Clark

Completely agree about this being a huge contender for the next killer app for games, but I have one comment on the use of an existing game to assess a player. We have found that game designs for assessment products and for training products are distinctly different. Taking an existing game and “plugging in” some scoring does not necessarily provide the most robust assessment profile of the user.

When developing an assessment product, we have learned that very specific definitions of the competency mastery levels are key, as is designing the game to clearly differentiate between assessment of knowledge, skills and behaviours. For the results to be useful for an organization and the user, these elements need to be pretty cleanly decoupled to ensure that there is no “masking” going on (i.e. unknown or unidentified interactions between these different elements).

A simple example could be that an organization wants to understand how much knowledge a player has about a specific subject and how quick their reactions are in solving a given problem about that subject (2 axes of testing).

If we test both simultaneously and the player “fails”, what caused that failure? Lack of knowledge or poor assessment skills? A specifically designed testing product, on the other hand, may have 3 elements: first a clean knowledge test, second a clean problem assessment test, and third an environment where both are put together. This means that when a player fails it is now much easier to define how and why they failed. Do they not have the required knowledge, can they not apply assessment techniques, or are they great at both but lack the co-ordination skills to put them together simultaneously?
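The decision logic behind the three-element design can be sketched in a few lines of Python. This is only an illustration: the function name and the labels are hypothetical, and each element is reduced to a simple pass/fail result.

```python
def diagnose(knowledge_ok, assessment_ok, combined_ok):
    """Name the likely cause(s) of failure from the three separate test elements."""
    causes = []
    if not knowledge_ok:
        causes.append("lacks the required knowledge")
    if not assessment_ok:
        causes.append("cannot apply assessment techniques")
    # Both clean tests passed, yet the combined environment was failed:
    # the remaining explanation is putting the two together simultaneously.
    if not combined_ok and not causes:
        causes.append("cannot co-ordinate both at once")
    return causes

print(diagnose(knowledge_ok=True, assessment_ok=True, combined_ok=False))
print(diagnose(knowledge_ok=False, assessment_ok=True, combined_ok=False))
```

With only the combined environment, the two calls above would be indistinguishable: both players simply “fail”.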

My feeling is that most games today are developed for training, and therefore these different components (or axes) are mixed together. This is in fact a GREAT thing for a training product, as this interplay really makes life interesting for the player, but it may mean (in my opinion) that generating valid, defensible scores would be tough.

… and this is exactly what the high stakes assessment guys need – valid and defensible scores. For the games space to move in this direction we need to really understand what the psychometricians do in the credentialling space and take some lessons from them. We then need to really sit down and look at our game designs from a new angle, not just a re-hash of existing training product.

Anyway, that is my 2 cents’ worth on a great post; all opinions on my (potentially) mad ramblings would be really appreciated.

If anybody disagrees, then I am up for a debate over a beer at both Brandon Hall and DevLearn (Sep and Nov respectively); come find me at the DISTIL Interactive booth.

Regards

Stephen Davies
s.davies@distilinteractive.com
