I have lots of ideas, and no time to implement most of them. And so I decided it would be fun to begin airing some of them out. They certainly will not do anybody any good in my head and perhaps some of them might catch on.
Today’s thoughts are again around user interface. A while ago I wrote about UI and the problems I see with computer interfaces versus plain old paper. Since then I have been thinking more about why the computer screen in many respects feels like such a poor user interface, when compared to paper, or more broadly, the real physical world.
I have come to the conclusion that a big part of the issue is the speed with which it is possible to “pan and scan” on the computer vs what you can do in the physical world.
With a newspaper I can quickly scan all the headlines and even quickly flip between pages to scan them. In so doing I am taking in massive amounts of information and making decisions about where I want to go.
In the real physical world, I can, in less than a second, look out of my window and very quickly determine if there is anything I should take note of.
On the computer these things are much harder. Our rate of interaction is *much* slower. To get to the heart of the matter, the best way to break down the problem is to think about two issues: zoom, and as mentioned above, pan and scan.
Zoom is the ability to step back and look at the big picture, or to step forward and focus on one specific thing in greater detail. This is a very difficult thing to do on the computer and particularly on the web. We have organized information on the web in terms of pages, which is fine, but the pages are generally rather small, and changing pages is painfully slow. I want to be able to “zoom” as quickly on the computer and the web as I do in the physical world with my eyes. I want to move instantly between seeing an overview of lots of information and focusing on one specific thing.
Pan and scan goes hand in hand with zooming. Because when you are zoomed in you can’t see very much and so you need to be able to very quickly look around in a space. Again in the real world, our eyes are exceedingly good at this. But on the computer, scrolling is very slow and difficult.
Interestingly, the iPhone has highlighted the importance of zoom and pan and scan. They have implemented zooming through their new “pinch” feature which allows you to touch the screen with two fingers and pinch the display to zoom out and to do the reverse action to zoom in. Similarly, and more familiarly, you can drag your finger across the display to move around.
The popularity of these features speaks to the incredible importance of both zoom and pan and scan. And while they are great advancements in user interface, they are still, to me, not sufficiently fluid.
The physical friction of the finger against the glass with both actions makes them feel more rigid. And the speed of their operation while impressive relative to other interfaces does not really compare to our ability to zoom and pan and scan with our eyes.
Last time I wrote about this issue, I discussed the virtual reality glasses, popularized as a concept back in the early 80s, and how they were used to allow one to pan and scan by turning one’s head. And while one day that might be an interesting advancement in mainstream computer interface, I don’t see people wearing VR glasses in the near future to browse the web or work on a document.
Nevertheless, the fluidity of the VR glasses experience I believe can and should be implemented on mainstream computing devices – just without the headsets.
The idea is fairly straightforward.
Imagine a computer screen that could sense the distance of your finger from it, and where your finger was in relation to the screen. For example, it could tell if your finger was 2 inches or 3 inches away from the screen and it could tell if you were in the upper left hand corner or the bottom right, or anywhere in the middle. This would be a kind of 3D touch-less touch screen.
Now imagine that this interface allowed you to control zoom and pan and scan. The closer your finger gets to the screen, the more zoomed in it gets to a particular area. The further away it gets, the more zoomed out it gets. Moving left or right would be the equivalent of scrolling left or right. By not physically touching anything, the interaction is frictionless, and therefore much more fluid, allowing it to integrate much more naturally with the way the eyes work. The key to this, of course, is that all zooming be instantaneous. With the graphics processors we have, this has long been possible, but given the physical interfaces we have been working with, it hasn’t been particularly important.
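To make the mapping concrete, here is a minimal sketch of how a finger position could be turned into a view of the surface. Everything in it is an assumption for illustration: the name `finger_to_viewport`, the 0.5 to 6 inch working range, the 5% minimum visible fraction, and the choice of a geometric zoom curve. It presumes some sensor already reports the finger's position, which is the hard part and is not shown.

```python
from dataclasses import dataclass


@dataclass
class Viewport:
    center_x: float  # centre of the visible region, in surface units
    center_y: float
    width: float     # size of the visible region, in surface units
    height: float


def clamp(value, low, high):
    return min(max(value, low), high)


def finger_to_viewport(fx, fy, fz, surface_w, surface_h,
                       near=0.5, far=6.0, min_fraction=0.05):
    """Map a finger position to the region of the surface to display.

    fx, fy: finger position over the screen, normalised to 0..1.
    fz: finger distance from the screen in inches.
    near, far, min_fraction: hypothetical tuning values.
    """
    # Clamp the distance to the working range, then normalise to 0..1.
    t = (clamp(fz, near, far) - near) / (far - near)
    # Far away shows the whole surface (fraction 1.0); close up shows
    # only min_fraction of it. Geometric interpolation keeps the zoom
    # rate feeling even across the range.
    fraction = min_fraction ** (1.0 - t)
    width = surface_w * fraction
    height = surface_h * fraction
    # Pan: the finger's x/y picks the centre, kept inside the surface.
    cx = width / 2 + clamp(fx, 0.0, 1.0) * (surface_w - width)
    cy = height / 2 + clamp(fy, 0.0, 1.0) * (surface_h - height)
    return Viewport(cx, cy, width, height)
```

Because the viewport is a pure function of the current finger position, nothing accumulates between frames, which is one way to get the "instantaneous" feel: each sensor reading simply produces a new view.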
Another challenge is that in order for this to work well, software would have to be redesigned to take advantage of the interface. The good news is that I believe an optimized browser could solve most or much of the software problem. In this new paradigm, what you’d really want is to be able to scan around a large set of pages very quickly to see what is interesting. A new type of web browser could pre-fetch and render the surrounding pages for any given page. These pre-fetched pages would be laid out on one giant virtual surface, which would give you the ability to scan around a large number of pages very quickly.
For example, imagine the New York Times rendered in this way. When you go to the home page, it renders the home page, but also all of the pages that home page is linked to, and they are laid out in some yet to be determined but well organized fashion on this large virtual surface.
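As a rough sketch of the pre-fetch and layout step, the code below walks outward from a starting page and places what it finds on a simple grid. The grid is only a stand-in for the "yet to be determined" layout, and the names `collect_pages` and `layout_grid`, the page size, and the 25 page cap are all hypothetical. The `links` dictionary stands in for whatever the browser would learn by actually fetching and parsing pages.

```python
import math
from collections import deque


def collect_pages(start, links, max_pages=25):
    """Breadth-first walk of the link graph, nearest pages first."""
    seen = [start]
    queue = deque([start])
    while queue and len(seen) < max_pages:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.append(target)
                queue.append(target)
                if len(seen) >= max_pages:
                    break
    return seen


def layout_grid(pages, page_w=1024, page_h=768, gap=64):
    """Assign each page an (x, y) origin on a roughly square grid."""
    cols = math.ceil(math.sqrt(len(pages)))
    positions = {}
    for i, page in enumerate(pages):
        row, col = divmod(i, cols)
        positions[page] = (col * (page_w + gap), row * (page_h + gap))
    return positions


# Made-up link structure for illustration only.
links = {
    "home": ["world", "business", "sports"],
    "world": ["home", "europe"],
}
pages = collect_pages("home", links)
surface = layout_grid(pages)
```

Breadth-first order keeps the pages closest to the starting page near the top left of the surface. A real layout would presumably group pages by section or importance rather than by discovery order.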
By positioning your finger, perhaps 6 inches from the screen you can see all of the pages on the virtual surface, but in thumbnail form. As you move in closer, things start to become just barely readable, and you can start to make out pictures. You then start panning around for what headlines and pictures look interesting. You find something interesting, and so you zoom in (move your finger) closer. You then lock the screen there, perhaps with a button on the keyboard accessible with your other hand, and you begin to read the page.
To me, this feels like a much more intuitive way to read the New York Times than the way I currently do it online, and it might even compete with the way the physical paper feels. But the most interesting aspect to me of this is that I think it would make big bold advertising more effective, more acceptable, and perhaps even actually pleasurable, in the same way many full page print ads are today. In other words, I think this not only represents a better web user experience, but a better monetization platform than today’s web based display advertising.
Of course there is a lot to think about here, and I am sure there is already some paper somewhere by some UX guy who has proposed something similar. After all there are no new ideas.
That said, I strongly suspect no one major is actually putting anything like this together, particularly combined with the re-conceptualization of the browser, which I think could be key to fixing the web’s display ad problem.
And so I’d love to get a conversation about this going and hear what people think. I know there would be lots of things to get right here including things like a need for hysteresis to compensate for jittery hands. And I am sure there are many more details. But what I am most curious about is whether people actually agree that this would be a better way to explore information on a computer.
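On the hysteresis point, one simple form is a dead band: the reported finger position only changes once the raw reading has moved more than some threshold away from the last reported value, so small tremors are ignored. This is a sketch under assumptions, not a tuned design; the class name `DeadbandFilter` and the 0.25 inch threshold are made up for illustration.

```python
class DeadbandFilter:
    """Hold the reported value until the input moves past a threshold."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical tuning value
        self.value = None           # last reported position

    def update(self, sample):
        # Accept the first sample, and any later sample that has moved
        # further than the threshold from the last reported value.
        if self.value is None or abs(sample - self.value) > self.threshold:
            self.value = sample
        return self.value


# Finger distance readings in inches, with a little jitter around 3.0.
smoother = DeadbandFilter(threshold=0.25)
readings = [3.0, 3.1, 2.9, 3.3, 3.2]
steady = [smoother.update(r) for r in readings]
# steady == [3.0, 3.0, 3.0, 3.3, 3.3]
```

The trade-off is that a dead band makes deliberate small movements feel sticky, so a real implementation would likely combine it with some smoothing, or shrink the threshold as the finger gets closer to the screen.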
Monday, July 28, 2008
33 comments:
You can do this (sort of) on a mac by zooming into the screen. I think it's either Apple+ or cntl+.
hmm... all my screen did was zoom the browser window.
But then again the point is all about the physical interface and the speed. Given that I don't think pinching is good enough (which is implemented on the MacBook Air), you can imagine that I don't think a keyboard would do the trick either.
It's great that you are posting ideas you don't have time to implement. It's stimulating to see the creative process/products of someone smart.
I agree that the reading interface is pretty bad with computers and I agree with your assessment of what is wrong. How about using a pad flat on the surface next to your computer (or even attached to your keyboard -- such as the trackpad on a laptop)? That way you wouldn't have to have a finger above the screen all the time. That would get tiring for large screens. Plus people might not like to read with their finger.
The pad could sense your finger(s) location(s) and put an unobtrusive icon on the screen indicating where your finger(s) was in relation to the screen. But maybe the icon wouldn't be needed with experience. I believe there are no-touch keyboards for people with carpal tunnel syndrome that support specific finger motions. I wonder if that technology could be used.
Prefetching, large screens and an easy interface for navigating the desktop will make the Internet much more enjoyable and productive.
Travis,
Thanks. I am trying to understand your idea a bit more. What is the difference between what you are describing and a track pad? Are you suggesting two track pads on a computer? And if so, how do you handle the third dimension i.e. the zoom as opposed to the pan?
You really need to read up on Edward Tufte. His labels for what you discussed are: information stacking, small multiples, and information resolution.
http://www.edwardtufte.com/tufte/books_vdqi
Thanks Brent,
but it is unlikely I will be reading his book soon. My plate is too full... which is part of the reason I am writing a blog about this stuff rather than trying to do it myself. But perhaps you can educate us by summarizing what he says on this subject. Does he agree? Does he suggest the same product concepts? Does he reference anything interesting that anyone has implemented? How close are his ideas to mine? Any contribution to the discussion would be greatly appreciated.
"...in less than a second, look out of my window and very quickly determine if there is anything I should take note of."
"the ability to step back and look at the big picture"
The word you're looking for is "gestalt".
"These pre-fetched pages would be laid out on one giant virtual surface, which would give you the ability to scan around a large number of pages very quickly."
It's eerie. It's like you watched the TED photosynth video and now are trying to take credit for coming up with the idea.
http://www.ted.com/index.php/talks/blaise_aguera_y_arcas_demos_photosynth.html
Sure, absolutely.
We do want to interact with the computer in a way that's much, much closer to how we interact with things in reality. I would go as far as saying that I want to be even faster. In the physical world it takes me 5-10 minutes to find the photo I'm looking for, even if they're organized neatly in albums. On the computer I want to spend 5-10 seconds.
I absolutely agree, that the next revolution must be about speed. It is way too slow now.
I am using a 2560x1600, plus an 1600x1200 monitor in portrait mode. I am planning to buy a third portrait monitor. I want to surround myself completely with as much information as is available for display using today's technology.
I am using it, and I am a lot more productive since I have it. I can see two A4 pages, plus the software menus without any zooming in and out and "pan and scan". I even also have another document open on the other monitor. For example my inbox. Because it's 1600 pixels high, I don't have to scroll constantly. Most emails fit on screen. I can just drag parts of them into the document I'm working on the big center screen.
Before I bought these monitors, I spent most of my time with:
- minimizing screens
- maximizing screens
- moving screens
- opening / closing menus, software UI elements
Now I spend most of my time with actual work.
Someone, who hasn't used multiple displays will think it's not important. But I haven't seen any serious user, who tried multiple monitors, and wanted / was able to go back to a single monitor. After you got rid of all that openingclosingzoominginoutandscrolling all the time, you want to do it no more.
Problem is exactly what you are saying. Speed, and the way I have to interact with the massive amount of information I store on my box and browse on the internet.
This display I have, with the GPU I have could provide a 10x better experience.
I can display 4 + 2 megapixels of information at the same time. When I buy the third monitor, I'll face 8 megapixels. It is already awesome, really.
What I really miss from this point is the speed I can browse through photos in the physical world. I have a directory in which I store the stock photos I buy. I organize them in several folders, but still, each folder contains several hundreds of photos. It takes a long-long time until I can find what I'm looking for. The display is good enough, the GPU is fast enough. It's the mouse, and the simple 2D scrolling that's in the way.
When there was only the keyboard, only nerds (including me) were interested.
I remember the time, when I first READ about the mouse! :) I just could not imagine what it could be like to work with that thing! It sounded like such an extremely abstract concept. Then I bought my first Amiga, and very quickly became obsessed with it. It turned out, that using a computer can be quite an experience. I really enjoyed the mouse, and GUI. Back in that time, the computer had a monochrome VGA screen and a keyboard. I was amazed by the quality of animated graphics, CD quality stereo sound and the mouse, the GUI.
And I still am. I spend more money on my keyboard and mouse than most people spend on their whole box. Not to mention the displays... The way I interact with it is *very* important for me.
And it's not only me. Just think about the success of Wii. It's pretty low-tech in terms of graphics, but its new way of interaction made it the most successful console today. I immediately knew I would buy one when I saw the first Wii video, and I did. And I wasn't let down, it's pretty cool. But I think, it must only be the beginning. They should take it further from here.
I couldn't help thinking about the Minority Report movie, reading your post.
I agree, I will never wear a helmet or VR glasses. I want my 8 megapixel display in my face.
But as for the interaction with the surface, maybe it should not be built into the displays. I would much prefer not gloves, but a set of sensors I can quickly wear when I sit down for some serious work. It must be lightweight, and should not cover my hands, the tips of my fingers. I still want to be able to use the keyboard, or even the mouse.
It should be a set of lightweight rings, that I could wear on my fingers on one or both hands.
I want to be able to use a lot of hand gestures. For what? Ultrafast zoom, pan and scan, rotation, multitouch style scaling, positioning, order sorting. I imagine interacting with the GUI very much like Tom Cruise did in the movie.
Who should do it? Logitech. And / or similar companies. It should be a device, similar in concept to what the mouse is today. A new generic input device. Its device driver could map gestures to certain OS GUI actions. This of course would be of very limited use at first, so maybe only a few early adopters (including me) would start using it. As it became more and more popular, more of its features could be built into new versions of the operating systems and special software.
So, at the first level, it could be a (good) mouse replacement. I imagine, that now, when I finish my comment, I wouldn't have to lift my right hand from the keyboard, but just lift a finger, point it into the air, move the pointer to the "Post Comment" button, and do a push gesture.
Maybe the company that would produce the first implementation, should go further than just a mouse replacement in the first place. Maybe they should partner with a browser (Firefox?) to implement a special version that would work very much like you described. But with gestures I would go further. With one gesture I want to flip through pages. Even faster than in the physical world! I want it to be blazing fast. Turning a physical page could take half a second. I don't even want a nice 3D pageflip effect to simulate what's in the real world. I don't want to waste my time on that. What I want, is to switch to the next page within a few milliseconds, with just one fast finger gesture.
Another gesture would zoom in/out on the thumbnail view. A pointing gesture would zoom to the document I was pointing at. If it didn't detect precisely what document I meant, it should zoom into a larger region, but certainly to that direction. Then, with just another pointing gesture I would have the right document selected. If the pointing gesture is fast, it should zoom in fast, if it's slow, it should zoom in slowly, so that I get more and more detail from the thumbnails.
Then software vendors could invent clever gestures for different application specific tasks. In graphics design we are very often arranging layer orders. Push stuff back, bring things forward. It involves a lot of clicking around. I can imagine opening my fingers, and physically pushing a layer backwards.
Same for rotation, and scaling. It takes a lot of mousing to define the correct position, scale and rotation of a given object on a page. 10-15 different movements and clicks. With this input device I could put it there, and while I'm moving it, I would also scale and rotate it. Whoosh, there. Pick next, whoosh, there. Ready.
Let's hope Logitech are reading it. They'll certainly have one customer. :-)
Wait, you claim to have been working in this field for how long and you don't know Tufte?
it has to move as fast as thought and until it does ...
Milio,
One of the great things about living on this earth is knowing that you cannot know everything. I certainly don't claim to know everything about anything, which is why I embrace discussion and learning, and why I open the things I am thinking about up for discussion.
That said, one advantage I *do* appear to have over you is the ability to read.
Specifically, I said I was sure that someone somewhere has already thought this up. That said, the example you provide overlaps some but by no means entirely with what I am talking about since I am primarily addressing physical issues, and browser integration, neither of which are discussed in the TED talk.
Tufte, read it. His work is foundational knowledge to your interests. Get some credibility back because I think you've lost it with a lot of people.
Anonymous,
What a dumb statement. It's funny how whenever people want to throw a discussion off course, they throw in academic source X and establish that as the basis for the discussion.
I don't think this is the case with Brent, though it often is, and clearly you, whoever you are, fall into that unfortunate category.
I would surmise that you can't talk about the subject at hand and clearly have nothing to say. Amusingly you, as most people like you, always fall into the category of the anonymous.
Now first of all, I am not here to "establish credibility" with anyone. I do my work, and either it has merit or it does not. If something doesn't make sense, if you had any intellect at all you would disagree directly.
With regards to my interests I think you will find, if you read this blog regularly, which obviously you do not, that my interests are quite broad. And I promise you there are many thousands more books on economics, software design, database design, interface design, algorithm design, etc that I haven't read.
So for all of you who have such a litmus test, please leave now. I promise you whatever book it is that you feel is important, in whatever subject I happen to be writing about, odds are I probably haven't read it.
So it would be really interesting to get back to the subject at hand. Fascinatingly, it appears you have nothing to add to the actual discussion, which in fact, is simple enough, I suspect, for most of us to understand without reading any further academic writing.
coda: It is just amazing to me how many people there are that really have nothing useful to say.
Tufte admirers,
Use your knowledge to contribute ideas about a better interface.
Hank,
My idea was to take your suggested gesture commands (zoom, scroll, pan) and use them with a touchpad-like device or some pad next to the keyboard rather than the screen itself. You wouldn't *touch* the pad but the pad would recognize hand movement exactly as you suggested, but you wouldn't have to suspend your hand next to the screen.
Ben does a good job of expanding on what is needed. So the question is what sort of interface. He suggested rings. I'm thinking a new type of no-touch pad. See this (now defunct) company:
http://www.fingerworks.com/
NYTimes article from a while back:
http://query.nytimes.com/gst/fullpage.html?res=9D0DE4DE113BF937A15752C0A9649C8B63
The keyboard recognized hand gestures without touching. That is exactly what we need but combined with better software interfaces. You could have software generate almost-clear fingers on the screen for orientation and movement cues.
Travis,
This is interesting. I get it now, and I think you might be on to something. But I don't think you need rings or anything. I think it could work with cameras. I think relying on gestures though would be bad. In other words I think it would need to be more like a mouse, which tracks position, than a gesture based thing that tracked movement. I would also be concerned by the idea of rings, or anything that was not focused on my finger tips since that is certainly where all my fine motor control is.
But I think I agree that something on a surface would be better than something towards the screen. Although when I think of hand-held devices like the iPhone, where this kind of thing could be *really* useful, I start thinking screen again.
Take a look at www.nestedguis.com. (especially look at the videos) A very innovative concept for presentation.
"the example you provide overlaps some but by no means entirely with what I am talking about since I am primarily addressing physical issues, and browser integration, neither of which are discussed in the TED talk."
The browser integration of the Seadragon/Photosynth tech demo is available as the Deep Zoom component of Silverlight 2.
That demo was based on the multitouch panel interface, but if you want 3D positional just Google for any of the Wii hacks.
So the hard work has been done, now get crackin' so we can see tangible examples of the idea!
Hank, it would be very useful for a desktop too...
I didn't like the screen idea, because I don't need new screen, but a new input device / GUI concept. I want to choose the display that suits my needs, and just add the new dream input device to my existing system. Or any system really.
Rings however was just a poor description. What I really wanted to say, is that I would feel uncomfortable about wearing full gloves. I would if I had to, but would prefer just a set of sensors, that can precisely track the *movement* of all (or some of) my fingers. If I don't have to wear anything (pad, cameras) the better. Maybe instead of "rings" it's just a few "points" that they use in 3D motion tracking. You are right, fingertips really are very important. Still, I want to be able to type too, with a keyboard, while wearing whatever I have to (if I have to).
Anyhow, it should definitely track motion too. I want to be able to move things, move the (mouse) pointer. Gestures are for added functionality. For flipping, turning, closing, opening, etc.
By gestures I mean, that I can open my palm, take a window (or any object really), and move, scale, rotate, throw, twist ... it at the same time. This would shorten 15 mouse actions into half a second. Whoosh, there.
By gesture I mean, that the GUI "understands" what exactly I want to do with that kind of finger / hand motion. Zoom in/out, pan, pageflip, or just move something. Let's say I have 100 page PDF file, that I view in fit page mode. I could use a "turn page gesture" with a finger. Bam, bam, bam - back and forth, very quickly. Another gesture would immediately take me to a thumbnail view of the whole document, and slowly closing my pointing finger towards a section of pages would slowly zoom into those pages.
This would just feel so natural. The input device should work like if I could "by magic" really touch and manipulate things physically on the screen (without effectively touching the screen though).
Travis, fingerworks was interesting. I don't like two things about it. One is that I have to (lightly) touch the surface, the other is that it's 2D. I cannot point *towards* and away from the screen. I think it would be important, that I could work *on* the screen, *with* the very objects on the screen. It would be much more difficult to work with an object on the screen, but physically moving my hands over another surface (like a pad). That's how we use a mouse, that's why it takes me altogether ~1 second to navigate the mouse pointer to the "Post Comment" button. Whereas I can point my pointing finger to it, lifting it from the keyboard in no time. The gesture part is only that I also have to push it, when it becomes highlighted.
Now I have to lift my right hand, move it to the mouse, and grab it. Then it takes me another second to just *find* the pointer on the screen. :) I have to move the mouse a little, so that I find the pointer's motion on the screen. Then, I have to move the mouse to move the pointer from its original position to the button, and then press the click button. All this could be shortened into a single pointing with my pointing finger, and a push motion. 1-2 seconds (or so) shortened into a single motion (fraction of the time).
It sounds ridiculous, when I talk about simplifying a single button click, but when we deal with thousands of such interactions per day, it adds up. A new interface could improve usability even much more than the mouse did when it was added to the keyboard only interface.
This is how I think it would be natural.
I agree that the most important thing is to be able to deal with a massive amount of visual information a lot faster than today. Displaying a lot at the same time (a lot of megapixels), and being able to navigate *all* the information much faster. The way we turn our head into another direction, and we see something else immediately. No mouse pointer movements and clicks. With finger gesture it's the same. Only we don't turn our heads, but move hand / fingers to see another part of the GUI instantly.
Ben,
Great description. I'd buy the product immediately.
You could probably implement the 3D idea with increased pressure on the fingerworks pad. The normal touch is *very* light and might be as good as moving through the air. You are right that moving your fingers away from the keyboard is a pain.
The more I think about it, the better your ring idea sounds (or some obvious variant). The finger tips could be kept free and the bulk of the (hopefully very light) device could be on the back of the finger. One could use triangulation with some receivers that connect to a USB port. The receivers could attach to a keyboard or table and the hands are in the middle, but placement wouldn't be crucial. The user could customize the coverage area. Gestures could turn input on or off so that typing fingers wouldn't move GUIs on the screen.
Each ring would probably need its own small battery to transmit some frequency, although some type of RFID technology might be possible in the future.
All of this seems quite doable with existing technology.
Hank,
The Deep Zoom component of silverlight is pretty impressive. Is that close to what you are imagining, but be able to use it with web pages and gestures?
Travis,
Deep Zoom is not new. Flash has been able to do everything there for years. But whether Flash or Silverlight, zooming is not a big deal. The big deal is doing zooming in a generally useful context. When you can browse the web using deep zoom, and the software preloads pages and lets you pan and zoom around all the preloaded pages, then all that will be required is better interface hardware. But barring that, Deep Zoom/Silverlight don't really help as far as what I am talking about. As I said in a prior comment, Adobe Air, which includes an HTML renderer plus Flash, would probably be the best and easiest tool to use to implement what I am talking about.
Hello Hank,
In a talk by Ben Fry at the See Conference (http://www.see-conference.com/) he showed an interface developed by John Underkoffler, who developed the gestural interface animation used in the film Minority Report. After the movie, John decided to try to make a real interface that works with gestures. The software is called g-speak and the company he founded is Oblong Industries.
Johnny Chung Lee has also done some interesting examples of finger tracking and head tracking using the Wii Remote to track up to 4 IR LEDs. There is a TED Talk by him demonstrating some of these. http://procrastineering.blogspot.com/
http://www.wiimoteproject.com/
Although I think that the comments above pointing you to Tufte's works were a bit abrupt, I think it is likely that you would find them interesting. They are self published so that he can control their layout and production values. They demonstrate his ideas visually - so they are beautiful to look at rather than dry academic manuals. http://www.edwardtufte.com/tufte/
Brenda,
Thanks for the links. I will definitely check them out.
As far as Tufte, perhaps I have been unclear. I am sure he is very smart and has done great work.
The reason I wrote this piece is (see first paragraph) I ***don't have time*** to pursue this stuff. I am doing a ***STARTUP***. I work continuously. There is an enormous amount of stuff that I am sure I would love to do, but reading books that will not help me launch my product are not part of the plan right now. Doing a startup is quite time consuming, and writing this blog takes a fair amount of time as well. It is just amazing to me that this is so hard to understand!!!!
Sometimes I feel like everyone reads every other word I write. Hmmm... what would that be like...
As cool as the Minority Report interface looked, I'm doubtful you wouldn't fall victim to the same "gorilla arms" effect that's stopped touchscreens ever taking off outside of the mobile world. I really don't want to have to lift my arms off the desk on any long term basis.
Hank,
I hope you don't let all the unsolicited reading advice prevent you from posting other ideas that you have ;)
I'm too busy trying to make Tufte-style GUIs at work right now to spew back what he says.
Honestly, his books are so well constructed that any other means of conveying his information is unsatisfactory.
Pan and zoom on screen support? Hmm: just 15 minutes of zooming and panning on Google Earth makes me so sick I must take my eyes away from the screen for half an hour.
Travis,
I wouldn't like the pad surface, because it's also just a mousepad. Only, without the mouse...
Seriously, what I think would be important is, that one could manipulate screen objects as physical objects. *Naturally.*
Using the mouse well isn't that easy. It's so much fun watching my mother's quest to reach the start menu with the pointer. She has to concentrate extremely, and very rarely succeeds.
But anyone can lift his/her pointer finger, and *show* an object on the screen. Ok, highlighted. Then, anyone knows naturally, how to "grab" an object (open five fingers) and "scale" it (move all your fingertips closer, or further away from each other) and "rotate it" (rotate your wrist), move, push or pull it. It takes hours until you explore how to achieve the same goals in a new graphics application, using its set of manipulation tools.
It would be a pain to switch typing / GUI interaction mode all the time. It must detect modes automatically, from the position, movement of your fingers. If necessary, it could come with its own special keyboard, that would detect if a finger is close enough to a key = it's typing.
Paragraft,
Yeah, nobody wants a whole-day fitness exercise from using a computer. The best part of using a mouse is that I move the mouse one inch and the pointer runs across the 40 inch screen. :)
The idea of a PC touchscreen sucks because of this (but it's ok for a phone). It doesn't mean though, that you couldn't use your fingers as a very sophisticated input device, having your wrists resting on the keyboard.
You just lift your right pointer finger slightly, *show* a window, and do a small "push down" gesture. Bam, that window is minimized. Now point to an application on the taskbar (highlighted), "push up" finger gesture, it's maximized. All this without having to lift up your wrist, and reach for the mouse.
No sex with the Thinkpad clit, no touchpad touching, no aerobic with a 40" touchscreen. You just command that screen object by natural finger gestures... You! (highlighted) Get the hell out of my screen! Just hold your middle finger alone and backwards up, and show it to the application you want to quit without saving.
Enough rambling, I guess I described my dream of a better UI in more than enough detail now. :)
Aggressive progress toward what you have described in this post would be accomplished via a camera which understands sign language. Backed, of course, by software which interprets sign language into instructions even legacy software can understand, and future software can exploit.
A new, flailing-armed breed of sign language, or one already used by the hearing impaired? That's not important. We can even have different languages, with users learning and switching to their preferred one, just like in the natural world.
A pleasant side effect of being prepared for multiple language sets would be the ability for people with varied injuries or disabilities to adopt a language most suited to them. It would downright suck to lose your ability to resize images in a bowling accident.
Hank's camera would be embedded in his monitor, easily able to observe his sign language from 6+ inches away. Ben's camera might be perched on a stand in front of his keyboard. Location is as unimportant as the question of which gestures (language set) to use. Open up both to the user's preference.
Floggy,
Agreed. It has to be a very good, high resolution camera, running at 100Hz or something. It should read subtle finger motion, such as really fine closing of a finger towards a screen object. But I guess that's not necessarily a problem any more. And maybe, for precise, reliable, realtime operation, the pixel-tracking magic should be hardware assisted, either by a modern GPU or by something built into the input device. And maybe for high precision work (such as graphics, 3D, CAD, other complex GUI operations, in-air typing) one could wear a pair of black gloves made from a comfortable material, with high resolution lines and dots printed on them. This could help the input device track the 3D state of your hands and fingers much better.
Also, I agree that there could be different languages. It is indeed a good idea to adapt it for use by the hearing impaired. They could become the fastest typists. :) One could customize any default, or record gestures for any application for specific tasks. The basic things however should be standard across all platforms, in a similar way that mice work (almost) the same way on any box. Pointing for example should be standard by default. Unless of course you teach your device to use pointing some other way.
Now, that we have all this sorted out, it's time that we go back to what poor Hank had to say, before we input device freaks took over his blogpost. We've been thinking about the input device, whereas that was only a part of his original post (and not even the most important part).
Correct me if I'm wrong, but the idea was basically, that we should be able to zoom in-out, and "pan and scan" on an infinitely huge surface. He described a browser on a first level of implementation, but it could also be deeper, all across the OS.
Really, that's where it becomes really interesting. It should really work like "looking around" in the real world. It could take around 5-10 seconds to check out 15 websites. Imagine this. You have 15 of your favorite websites displayed on 15 large enough monitors, so that they display their whole content (that is, without scrolling). 15 monitors side-by-side. As you turn your head, you have a sense of what's on. Then you can make a decision what you want to view closer within a few seconds.
Now, just loading and scrolling through these 15 sites would take several minutes using the browser and mouse we have today. With a better GUI I could raise my hand, do a panning motion, and look (pan) around on this huge surface, closing in on what might be interesting and zooming back out again. Having this huge single surface could be much better than the concept of pages. Pages were invented because we can't physically handle a sheet of paper above a specific size. It soon becomes impractical in the real world. But a virtual surface that could be "zoompanandscanned" easily is much better than flipping through a lot of pages.
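For what it's worth, the "huge single surface" boils down to a viewport: a centre point and a zoom level that map surface coordinates to screen pixels. Below is a rough sketch in Python under that assumption. The `Viewport` class, its field names and the default screen size are all invented for illustration; this is not how any particular browser or OS does it.

```python
from dataclasses import dataclass

@dataclass
class Viewport:
    cx: float = 0.0      # surface x shown at the centre of the screen
    cy: float = 0.0      # surface y shown at the centre of the screen
    zoom: float = 1.0    # screen pixels per surface unit
    width: int = 1920    # assumed screen size in pixels
    height: int = 1080

    def to_screen(self, x, y):
        """Map a point on the surface to screen pixels."""
        return ((x - self.cx) * self.zoom + self.width / 2,
                (y - self.cy) * self.zoom + self.height / 2)

    def to_surface(self, sx, sy):
        """Map a screen pixel back to the surface."""
        return ((sx - self.width / 2) / self.zoom + self.cx,
                (sy - self.height / 2) / self.zoom + self.cy)

    def pan(self, dx_px, dy_px):
        """Drag the content by a number of screen pixels."""
        self.cx -= dx_px / self.zoom
        self.cy -= dy_px / self.zoom

    def zoom_at(self, sx, sy, factor):
        """Zoom while keeping the surface point under (sx, sy) fixed on screen."""
        fx, fy = self.to_surface(sx, sy)
        self.zoom *= factor
        self.cx = fx - (sx - self.width / 2) / self.zoom
        self.cy = fy - (sy - self.height / 2) / self.zoom

# Zoom in 4x on whatever is under the pixel (100, 200), then drag right by 50 px.
view = Viewport()
focus = view.to_surface(100, 200)
view.zoom_at(100, 200, 4.0)
still_there = view.to_screen(*focus)
view.pan(50, 0)
moved = view.to_screen(*focus)
```

Keeping the point under the hand (or cursor) fixed while zooming is what makes it feel like "closing in" on something rather than jumping around. Rendering the surface fast enough is the hard part, and is not covered here.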
I think it would be even more useful with really complex applications, such as 3D platforms. Or let's just say Adobe Flash. It has so many windows that you either keep opening and closing them, or you see nothing of the document you're working on. Buying enough monitors to have *all* the (visual) information handy is a good workaround, but even this has its limits. You can't just surround yourself with more than 6-8 monitors... :)
So the solution is to keep an unlimited amount of screen real estate "handy", provided it's as easy, natural and fast as turning your head to see more and focusing your eyes on something to get more detail.
We have it today with the mouse, scrollable screens, tabs, with dockable panels etc. But it is, as Hank has put it, too inconvenient, and slow, and ineffective compared to reading the paper version of the New York Times.
So I agree, but I was only concerned with the right input device, because a better GUI will not work with a mouse. Without a perfectly new, much better input device all this "pan and scan" and zooming GUI would still not do the trick.
Hank,
I'm still thinking about this and came upon this TED talk which implies that things will be the interface.
http://www.ted.com/index.php/talks/kevin_kelly_on_the_next_5_000_days_of_the_web.html
Rob
Ben, found that camera; finger tracking basic sign language has already been figured out. All hail the mighty Johnny Lee!
http://www.dailymotion.com/relevance/search/finger%2Btracking/video/x3gr5u_control-the-wii-with-your-fingertip_videogames
As you described the user interface being more of a 3D space than annoying layers of 2D windows, another of Johnny's projects would probably be interesting.
http://www.dailymotion.com/relevance/search/head%2Btracking/video/x3ug1v_head-tracking-for-desktop-vr-displa_3d
Gotta love this guy. "It's like taking the picture out of the frame. And then you're just left with the picture frame, like a window into another room." Misquoted, no doubt. Enjoy the trail of bread crumbs. Soon we'll have this on the desktop, and Hank's dream will be as real as... anything virtual can be.