
Persistence: Why There Should Be No Cost to Minting Identifiers

08

June

  • Single registrars are a point of failure.
  • Registrars /must/ somehow secure their identifiers from point of creation to point of use to ensure integrity. This involves securing the root name servers and every name server between them and the client, and/or adding another layer that secures the actual data.
  • On the other hand, key-based identifiers that are managed by peers, signed, secured and freely distributable do away with the middle man by placing trust in signatures - which the secure single-registrar system requires anyhow. Minting identifiers at peers means these identifiers can be made persistent across multiple protocols and managed by the peers themselves, who may choose to enlist the help of other peers. Mint one identifier for identity in a peer-to-peer system? No! Single point of failure, remember. Mint multiple identifiers at layered security levels and map them to one another! Hash collisions become feasible over time as compute power increases; account for that with multiple levels of security.
    Registrars for human-readable names can then be built on top of such a system. Additionally, peer-based federation of naming reputation can occur, with petname systems for even greater end-point and end-user social security. Identity rights would be transferable from one peer to another by signing, via a Byzantine fault-tolerant voting process, with each signing passing on the history of ownership.
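
To make the "multiple layered identifiers mapped to one another" idea concrete, here is a minimal sketch. It assumes an identifier is simply a hash of a peer's public key, and the three hash algorithms, the function names and the random bytes standing in for a key are my own illustrative choices, not part of any spec.

```python
import hashlib
import os

# Hypothetical security levels, weakest to strongest (an assumption for illustration).
LEVELS = ("sha256", "sha512", "sha3_512")


def mint_identifiers(public_key: bytes) -> dict:
    """Derive one identifier per security level from the same key material."""
    return {level: hashlib.new(level, public_key).hexdigest() for level in LEVELS}


def build_mapping(identifiers: dict) -> dict:
    """Map every identifier to its siblings so any one of them resolves the others."""
    return {
        ident: {lvl: other for lvl, other in identifiers.items() if other != ident}
        for ident in identifiers.values()
    }


# Stand-in for a real public key: 32 random bytes.
key = os.urandom(32)
ids = mint_identifiers(key)
mapping = build_mapping(ids)
```

If the weakest level is ever broken, a peer holding the mapping can still fall back to the stronger identifiers for the same key.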

The real reason for no cost to mint identifiers?* Persistence. Peers can manage their identities! They own the identifiers, not some third party. No cost for minting identifiers also has an added benefit for privacy and anonymity. Want to stay private? Mint a unique identifier per transaction.

Unfortunately, until IP addresses themselves (and MAC addresses too) can be minted on the spot, nothing you ever share online is truly private: it can all be tracked back to these fixed identifiers by those with motive. Because any fixed-point identifier can be zeroed in on, true anonymity and privacy would require any peer in a network to be mobile and able to mask its identity by making its network hopping so complex that it would be infeasible to compute a pattern in a set time. Unfortunately the pattern itself could lead to identity. Instead, a layer of indirection distributed among anonymous peers appears to be the only realistic way to obtain a level of anonymity. True privacy requires more effort.

Anonymity and privacy are not really the key I want to emphasize - rather, it’s persistence, with user-driven identity as the real /key/ to identity.

Demand peer-to-peer key-based identity for you and me.

*Everything has a cost; persistence’s is management. The peers must still pay by providing or obtaining resources to manage their identity data; however, they themselves have the first and last say. Family wills could pass on identifiers by signing them over to kin when people die…
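
As a rough sketch of "signing an identifier over" so that the history of ownership travels with it: each transfer record below commits to a hash of everything that came before. Two loud assumptions: HMAC with a shared secret stands in for a real public-key signature (the Python standard library has none), and the Byzantine fault-tolerant voting step is left out entirely. All names are hypothetical.

```python
import hashlib
import hmac
import json


def _digest(history: list) -> str:
    """Hash of the full prior history, so each record commits to the past."""
    return hashlib.sha256(json.dumps(history, sort_keys=True).encode()).hexdigest()


def _sign(secret: bytes, payload: bytes) -> str:
    # Assumption: HMAC-SHA256 stands in for a real digital signature.
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def transfer(history: list, identifier: str, owner_secret: bytes, new_owner: str) -> list:
    """Return a new history with one more signed hand-over appended."""
    record = {"identifier": identifier, "to": new_owner, "prev": _digest(history)}
    record["sig"] = _sign(owner_secret, json.dumps(record, sort_keys=True).encode())
    return history + [record]


def verify_chain(history: list, secrets_by_owner: dict, first_owner: str) -> bool:
    """Walk the chain, checking each link was signed by the owner at that time."""
    owner = first_owner
    for i, record in enumerate(history):
        secret = secrets_by_owner.get(owner)
        if secret is None or record["prev"] != _digest(history[:i]):
            return False
        body = {k: v for k, v in record.items() if k != "sig"}
        expected = _sign(secret, json.dumps(body, sort_keys=True).encode())
        if not hmac.compare_digest(record["sig"], expected):
            return False
        owner = record["to"]
    return True


secrets = {"alice": b"alice-secret", "bob": b"bob-secret"}
chain = transfer([], "id-123", secrets["alice"], "bob")
chain = transfer(chain, "id-123", secrets["bob"], "carol")
```

With real signatures the verifier would only need public keys; the shared-secret shortcut here is purely to keep the example self-contained.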


Persistence is a Matter of Management and Openness

03

June

I originally wrote this as commentary to a question on the Tetherless World - Research Constellation, Future of the Web site.

XRI and the W3C have been going at it recently over this and a number of other issues. XRI uses persistent i-numbers or URNs and was recommended against by the W3C. See Dave Orchard’s post for details.

Personally, after thinking about this for a while, I agree with the W3C’s stance, while awaiting an XRI member’s counter-argument.

One way I see the cost of managing persistent identifiers being solved may be a tax (shriek!), but I don’t like that idea. How many persistent identifiers would that get me?

Any single point of contact is a point of failure…

You’d think people and objects really only need one persistent identifier per object that travels with it through its lifetime. MAC and IPv6 addresses come to mind here. However with this approach, privacy is a real issue to be dealt with.

Any robust system should be self-sustaining…

Peer-to-peer distributed naming is another option. Unfortunately, hash collisions mean you move the management elsewhere, into crypto systems on untrusted storage and compute nodes. The crypto systems aren’t the problem (maintaining them is), nor is the untrusted storage (everything’s encrypted!). It’s the compute part performing the crypto, and the network in between nodes, that is the problem. Hence what we have now with DNS, centralised management and DNSSEC (which hasn’t taken off, as I understand it). It’s the same problem for distributed processing.

I’m currently trying to come up with a better answer - if there is one - other than simple centralised management and paying for a domain every ten years (the current maximum lease period). I think a different peer-to-peer approach could be made to work here.

The bottom line is that storing anything persistently over generations is never easy; it requires management, and deciding who to trust that management to.

Also, when it comes to data and systems retention (it’s the system you’re /really/ trying to persist, not the data), it’s a very complicated problem that leads you down rabbit holes into persisting the hardware that the software uses to transform the data. Open hardware and software, and licensing that allows Virtual Machine creation in order to emulate older devices, would be a positive step towards this. That’s why I run Linux! From memory it’s illegal to emulate the ARM CPUs used in most mobile phones, let alone all their embedded peripherals. Mac OSX is also illegal to virtualise as far as I know, let alone the hardware it runs on. Neither Windows nor OSX is open source.

I’m all for openness and letting the community maintain systems where vendors with limited resources come and go along with the services we rely upon.

Not only that but the lessons we can learn from history.

Everything we know now is the result of standing on the shoulders (knowledge) of giants. History is an important teaching tool, and it’s easy to see how quickly we forget. Past knowledge - where feasible - should be made available for generations to come, in order to learn from both our mistakes and greatest achievements.

Great ideas are timeless, we should keep them that way!


Embedded Concurrency Orientated Programming - On The Road to Intelligence

01

June

I just watched Joe Armstrong’s excellent talk on Concurrency Orientated Programming from JAOO. Google it on InfoQ.

For a while now I’ve been thinking about implementing such a system in hardware itself. Here’s how it would go:

Embedded Concurrent Exokernel on FPGA = Embedded Concurrency Orientated Programming.

Embedded Concurrent Network Programming = Embedded Concurrency Orientated Programming Devices.

Self-Assembling Systems = Polymorphic Embedded Concurrency Orientated Network Programming Devices.

Weak Artificial Intelligence = Self-Assembling Systems.

Strong Artificial Intelligence = Self-Assembling and Sustaining Systems.

Artificial General Intelligence = Self-Replicated and Sustained Artificial Intelligence evolved over generations.

Think Exokernel as DNA, FPGA as The Environment and Devices as Cells.

Bootstrapping an Exokernel from primitive elements is the key to setting the process into motion. Then teaching ‘them’ (combinatorial elements or compound species) along the way by manipulating resources available in the environment and seeing who survives.

Primitive genetic algorithms just got more interesting to me. As did Maxwell’s equations, elemental physics, quantum mechanics and Craig Venter’s research.

Above all however; the medium for message transfer. Think electron transfer, think ‘magnetic’ fields, think waves, interference and supersymmetry.

:)


The Android Unlock Interface

01

June

It’s awesome, but what if you had unique icons that were arranged differently each time you went to unlock it, while your sequence stayed the same? People looking on may find it harder to follow your movements. Some simple “start” and “end” point markers are another option. Going further still… timing how long a finger was over a node could add another metric. Enter the sequence too fast… or too slow… and no go.
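
A quick sketch of how that could work, assuming a nine-node grid: the secret is a sequence of icons rather than positions, the grid is reshuffled for every attempt, and each touch must dwell on its node for a plausible amount of time. The icon names, dwell bounds and function names are all made up for illustration.

```python
import random

# Hypothetical icon set for a 3x3 unlock grid.
ICONS = ["cat", "tree", "moon", "key", "fish", "star", "boat", "leaf", "drum"]


def shuffled_grid(rng: random.Random) -> list:
    """Return the nine icons in a fresh random order for this unlock attempt."""
    grid = ICONS[:]
    rng.shuffle(grid)
    return grid


def check_unlock(grid, touches, secret, min_dwell=0.1, max_dwell=1.0) -> bool:
    """touches is a list of (grid_position, seconds_over_node) pairs."""
    if len(touches) != len(secret):
        return False
    for (position, dwell), expected_icon in zip(touches, secret):
        if grid[position] != expected_icon:
            return False  # wrong icon, wherever it happened to be placed
        if not (min_dwell <= dwell <= max_dwell):
            return False  # entered too fast or too slow
    return True
```

Because the check is against icons, the finger path an onlooker sees differs on every attempt even though the remembered sequence is unchanged.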


The Why of XRI?

01

June

I’ve been reading of the XRI vote and kerfuffle with the W3C and thinking about persistence after my last post. As an onlooker, a while ago I began wondering: why XRI?

Why couldn’t an abstract URI or PURL, e.g. 3n265n2.name, be used as the identifier like an i-number and registered for 10 years - along with another user-friendly named URI, e.g. craigoverend.name, for locating something analogous to XRDS?

You can then have multiple names pointing to the one persistent URI. Persistence is a management issue, just as paying for hosting to serve up the resources at the URI is a management issue. This is my understanding of why the W3C have their stance: the problem of managing i-numbers or persistent URNs should users stop paying for their i-name(s), to a point where it becomes unsustainable for the registrar. How realistic the latter is, no idea; I’m no registrar expert.

Of course there’s more than just i-numbers, there are i-names that with abstractions for naming could be transformed with Joe Gregorio’s URI Templates(?) or some other standard.

If it makes interfacing easier, rules for example:

@company to be transformed into company.com

=person to be transformed into person.name
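
The two rules above are simple enough to sketch in a few lines. This only covers the `@` and `=` prefixes from the examples; the function name and the error handling are my own assumptions, and real XRI syntax has far more to it.

```python
# Prefix symbol -> top-level domain, taken from the two example rules.
RULES = {"@": "com", "=": "name"}


def iname_to_hostname(iname: str) -> str:
    """Transform '@company' into 'company.com' and '=person' into 'person.name'."""
    if len(iname) < 2 or iname[0] not in RULES:
        raise ValueError(f"unsupported i-name: {iname!r}")
    label = iname[1:].lower()
    return f"{label}.{RULES[iname[0]]}"
```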

The W3C has also written about when to use metadata in URIs and when content-type is preferred. Having now read Roy’s thesis about REST, I can see why they may raise this with XRIs. While I don’t know much about XDI or the finer details of most of the specs, my understanding is they’re really an extension to the naming services and could probably be built on URIs.

Are there reasons other than those I’ve highlighted as to why identity couldn’t be done with existing infrastructure?

I’d love to hear your thoughts.


On Twitter Media, FriendFeed and Quasi-Decentralization

26

May

Duncan Riley writes it’s time for FriendFeed to Kill Twitter. I have to say I find his post both hilarious and ironic considering his exploits on one of the 2web podcasts, which went something along the lines of distaste for FriendFeed living off others’ content. Which it mostly does. :)

As for the data aspect he mentions; Mr otherwise-incoherent-today Gillmor’s live vs 10 minute remark hits the spot. The luxuries of job scheduling vs live stream processing.

These new FriendFeed rooms too remind me of Jaiku’s groups - that is, if they’re close to real-time, otherwise they’re just blog/forum comments - and we need /more/ of those… ahem.

Part of the pureness that is Twitter is the ability to explicitly opt-in. It’s why the community is still (relatively) nice[Poor Ariel]. Rooms get noisy. Rooms incite attacks and tribalism that spreads reducing the effectiveness of any service I’ve ever used. I’m sensitive, I /notice/ these things - gah! Tribalism is the new enemy as far as I’m concerned, and in my mind FriendFeed are already heading down that path into darkness. Someone mentioned ‘Troll Feed’ in Dunc’s comments; seems fitting.

As for additional functionality that FriendFeed brings: it’d actually be very simple to add one piece of it, media, to Twitter. Simply follow the links people paste over HTTP, with an HTTP HEAD request (after the url-shortening redirects) for the content-type, then GET the files with known media content-types (jpg, mp3, etc.) and display them in capable clients. Very simple. No existing tweet changes. 140 characters still. SMS compatible; clients only display the media if they can and the user chooses to (embed flash to play the mp3?). I tweeted this idea /ages/ ago. Twitter or any Twitter client could add this tomorrow. Compare this to what I understand as FriendFeed’s metadata approach (I’m too lazy to look at the API again to double-check that statement), replicating what’s already been done for us with HTTP, and I just see another architecture designed in a way I wouldn’t have.
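
Here is roughly what that client-side check might look like, using only Python's standard library. The set of displayable types and both function names are assumptions for the sake of the sketch; a real client would want error handling, size limits and caching on top.

```python
import urllib.request

# Hypothetical list of media types a client knows how to show or play.
DISPLAYABLE = {"image/jpeg", "image/png", "image/gif", "audio/mpeg"}


def media_type(content_type_header):
    """Return the bare media type if the client could display it, else None."""
    if not content_type_header:
        return None
    bare = content_type_header.split(";", 1)[0].strip().lower()
    return bare if bare in DISPLAYABLE else None


def probe(url, timeout=5.0):
    """Send a HEAD request for a pasted link and report (final_url, media type or None).

    urlopen follows redirects by default, so geturl() is the address reached
    after any url-shortener hops.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.geturl(), media_type(response.headers.get("Content-Type"))
```

A client would call `probe()` on each link in a tweet and only issue the follow-up GET when the second value is not `None` and the user has opted in to inline media.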

ps. I read/heard somewhere Roy T Fielding, the HTTP spec editor and REST guru, mention “peer-to-peer” features for his upcoming alternative protocol known as Waka. No idea if it’s true, makes sense if it is… somewhere I did see a MONITOR method. So if people want a quasi-decentralized Twitter (the web ain’t decentralized), Waka may be the best place to start architecturally. He has put out a call for leaders in the Apache 3.0 project… XMPP (the pub/sub part) and the like just feel stop-gap to me. In my mind Twitter is just begging to be HTTPized, or I must be seeing things. Quite likely… it’s just obvious to me that resources monitoring other resources, being told to retrieve updates at their leisure, is the next evolution of the web… or something, as well as other latency features as Roy’s presentation highlights… best done to augment existing HTTP. Rohit Khare has a nice document and a thesis (gah!) with regard to extending REST with a layered approach to pub/sub that’s a good starter for those interested. While we’re at it, why not also see if we can use existing DNS infrastructure for a new tweetable @namespace?

IM Tracking is of course a difficult issue to handle and the reason according to Twitter for the recent meltdown. Put simply this could be solved with the introduction of real-time blog/tweet aggregation engines in a distributed manner. Services could even charge for the stream processing and live keyword tracking. Hey-ho, hey-ho, a-spam-and-keyword filtering-service building we go. Umm Akismet-on-steroids? But then this does bring up other issues I won’t go into here…

Otherwise… we can all keep having fun being architecture astronauts.

I know I’m having fun floating my ideas. Whee.

For me, the nextweb… she needs wait-less-ness, she does… yes-sir-ee - indoutably.

Wait-less-ness of space, and the scale that comes with it.


Online Identity - You’re Doing it Wrong

10

May

With the Internet Identity Workshop not far away, Alec Muffett writes of his distaste for parts of the current state of the Identity space. Hallelujah. I’ve been wanting to write this for a long time… now seems like as good a time as any to add my thoughts.

My takeaway of the moment is his advocating an entity’s (user/vendor) transaction history system as the authoritative trust and risk assessment mechanism. This isn’t new thinking; banks do this all the time. The problem with the current identity space, however, is that to do this requires reliably storing *any* entity’s transaction history - which requires silo-free persistence, redundancy and management on all sides of the transaction in order to verify those claims. We don’t even have that silo-free persistence or redundancy yet - nor do we have the tools to manage our data that’s online and always available. Until we do, all this Identity talk has been bunk to me. That’s why I see it as vital to address this problem first.

He talks about relationships, links; they all require persistence or self-healing(think AI) in order to operate reliably as a system. Break one and any number of reliant services break unless caching is implied on the part of the relationship contract, or a human is there to fix every failure. It would be like losing your birth certificate, and having to go through the motions to restore your identity history to a verifiable 100 points before you could verify who you are in order to open a bank account, or create new relationships in the real world.

Now there are those I see aimed at fixing related networking architecture problems that partly solve persistence and redundancy: Van Jacobson’s Content-centric networking comes to mind, heavily laden with its key management system, which has complexity drawbacks. Most of the other attempts out there however are all HTTP and REST-like based. One problem I see here is that we have a network architecture not designed for today’s identity, and one that criminals exploit at their own discretion, with so many points of failure in the network that policing is near impossible. So how do you secure those routers and wireless networks your valuable personal private data traverses? You can’t. How can you be sure the chipset inside network devices hasn’t been tampered with by organised crime or opposing governments? You can’t. Or your personal network device? You can’t, unless you can verify your device hasn’t been tampered with, to the best of the manufacturer’s ability, by checking an object’s signature of operation with them /over time/. On the network side, only by signing and securing the data or hiding it completely can you get some peace of mind. Peace of mind with the knowledge that, over time, that security will need to increase with computational power as attacks do. There are already ISPs injecting information into web pages as they traverse networks, their proxies being easy targets for anyone wanting to be malicious. How do you make sure the data you ask for is the data you get? You can cryptographically sign it, or as many sites do now, open a ‘secure’ pipe: HTTP over SSL. A secure pipe over a network that could have hardware that’s been tampered with? That doesn’t seem very smart to me. What if you sign *all* the data? Then the points of failure become the crypto, end points, users, devices and the key management systems. What’s important however is knowing you get what you ask for, from who you ask for it from - efficiently, such that such a system can scale.
The problem is that, even with cryptographic signing, hash collisions mean you might not get what you asked for, and instead get a virus that goes undetected… which means we still need malicious software detection - virus scanners, etc - to monitor for malicious intent, checking packet signatures and detecting anomalies that slip through, indicating the need to increase security. However, should malicious content breach a system, the best way to prevent it from causing untold damage is by limiting what objects can do in a system - at all levels of the system.
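
One simple way to picture "knowing you get what you ask for" is to ask for data by its digest and check the bytes on arrival, whatever network they crossed. The sketch below assumes a made-up `algorithm:hexdigest` naming scheme and a hypothetical allow-list of algorithms; it checks integrity only, and says nothing about who published the data, which is where signatures come in.

```python
import hashlib
import hmac

# Hypothetical allow-list; stronger algorithms can be added as attacks improve.
ALLOWED = {"sha256", "sha512", "sha3_512"}


def content_name(data: bytes, algorithm: str = "sha256") -> str:
    """Name a piece of data by its own digest, e.g. 'sha256:ab12...'."""
    if algorithm not in ALLOWED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"


def is_what_i_asked_for(name: str, data: bytes) -> bool:
    """Check received bytes against the name they were requested under."""
    algorithm, _, expected = name.partition(":")
    if algorithm not in ALLOWED:
        return False
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, expected)
```

An injecting proxy that alters even one byte makes the check fail, so the client can discard the response and try another peer.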

Doing this still leaves issues to solve on the end-point and user side, in order to limit social engineering by those wanting to be malicious. This has to be done by focusing on user experience, education, and the interface, by limiting damage and by providing a support network for users should it occur. URI or email namespace, and HTTP resource handling as the interface, are the real stumbling blocks here when it comes to adoption via user experience. You only need look at the Twitter namespace to see what’s possible otherwise, people often making requests for help to people they learn to trust as field experts over time - despite its inability to handle threaded verifiable transactions, something that could easily be added. Mobile and cross-platform devices really /are/ the future of transactions. What’s important about Twitter is that it’s /opt-in/. Capturing malicious claims openly in a persistent way can mean reputation based upon a person’s history can become a reliable risk assessment tool. Don’t let the bad people in by default to access everything about you. Layers or capabilities are the key. There are also means that can make transactions between untrusted parties safer. Escrow and transaction risk assessment, with repudiation should transactions go wrong, being one - so long as the escrow is trusted by both parties…

With regard to those three equal parties Alec mentions participating in a transaction (user/vendor/IdP) - I agree that the third is not needed once relationships are in operation. I break the three down to one ‘component’ performing different functions as in a peer-to-peer or end-to-end network - which is one of the original IP protocol design goals[Bring on IPv6! (and a scalable replacement)]. However, most of the time it’s easier for people to conceptualize two(user/vendor or client/server), with the third managing relationships(the connector) - not storing anything - and instead delegating capabilities. I say this because in any classical network, there are three basic elements: Components, connectors and the data that traverses connectors between components. My problem with many existing Identity efforts, is the term Identity Provider or the ‘middle man’ aka single connector; as they just add a silo of failure. In my mind these must be replaced with what I call Relationship Managers. While the network itself is the ‘provider’; that being the hosts you choose - or preferably the network chooses for you by locality - to replicate and distribute your data redundantly between. All accomplished via Relationship Managers that I consider analogous to the VRM Control Panel I’ve seen Doc Searls advocate. Those Managers themselves able to delegate to other Managers such that authenticating an identity from multiple points of contact is possible. Important for redundancy and the ‘object capabilities’ security model: As multiple, yet strictly authorized devices for authentication via those Managers, is then possible on a per device/capabilities basis. Revoke one device(or Manager), still use another. A layered failure approach. A process no different than getting a new credit card(or bank). These Relationship Managers will need to be able to delegate auditing like a ‘Hawk’ monitoring a person’s transactions in order to detect anomalies. 
Something that could be done in-house by users or outsourced; Identity Brokers doing that management - just like we do now with virus scanners. This important persistent transactional data can then be used to tailor services to a person’s user experience à la VRM. Everything can then be packaged with the user in or out of control, home-grown or outsourced; user-driven.
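
The "revoke one device, still use another" idea can be sketched as a tiny capability table. This is only a toy model under my own assumptions: the class and method names are hypothetical, and delegation between Managers, auditing and actual authentication are all left out.

```python
class RelationshipManager:
    """Toy model: each device holds only the capabilities explicitly granted to it."""

    def __init__(self):
        self._grants = {}  # device name -> set of capability names

    def authorize(self, device: str, capabilities) -> None:
        self._grants[device] = set(capabilities)

    def revoke(self, device: str) -> None:
        """Drop every capability for one device without touching the others."""
        self._grants.pop(device, None)

    def allowed(self, device: str, capability: str) -> bool:
        return capability in self._grants.get(device, set())


manager = RelationshipManager()
manager.authorize("phone", {"login", "pay"})
manager.authorize("laptop", {"login"})
manager.revoke("phone")  # lost phone: the laptop keeps working
```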

Whatever. Until decentralized data persistence, redundancy, namespace, and relationship management tools are here, it’s all bunk. There’s also another major part of the process yet to be solved: authentication. The current authentication arena makes me cringe. While I consider CardSpace one of the best of the bunch, it fails to follow the transaction history verification regime to learn and detect anomalies in operation, a process I consider important and that could be layered on top over time. Key-producing, dynamic, personal, biometric authentication learning systems that throw away the biometric data. In other words, AI and unique object detection. CAPTCHAs are failing, and it’s becoming increasingly important to be able to detect that a human is really a human and differentiate them from other humans… something I believe can be done with multiple time-varying history challenges, systems that learn like we do, if narrowly and task-centric at first. Recording passphrase fingerprints is an example of a step in the right direction. Still, like any other security measure it’s a moving balancing act, so I see CardSpace-like tools as a useful beginning in a layered object capabilities approach.

Still, with all these areas yet to be addressed, and many neglected from what I see from the outside looking in on the Identity World, I think we have a long way to go when it comes to the ‘online’ world moving towards a service that lives up to its name, before Alec’s hankering will subside for another that follows.


Building User Profiles by Data Mining Browser History Visited Links

09

February

Niall Kennedy has a post[1] on browser history visited link sniffing. By injecting popular links using JavaScript and checking css :visited, he’s able to track where people have been and customize the user experience to suit. It has privacy implications: I can see this being used to build up user profiles without consent and target all sorts of things like phishing and advertising. Having an opt-in system in place to provide this kind of data for sites to use, on a per-site basis, could be an interesting use of this data though. If users could be presented with an opt-in option on sites to use and store this information, it could be useful and possibly bypass privacy concerns, without the need to install anything. Doing so, user profiles could be built up over time through data-mining those popular links, and content then targeted at those users. It does however require data mining potentially popular links in the first place, but should you find a history match, crawling that site for more links and then matching those to the user’s browser history could create a nice usage pattern to mine useful context from.

Apparently this is an old issue, going back to at least 2001. I worry that in all that time XMLHttpRequests have been (and will be) used without consent to brute-force test a user’s browser history for visited links, done while hiding the bandwidth used as a movie or flash is played, etc.

Might be time to start clearing your browser’s history, or getting this Firefox plugin[2] if you’re worried. :)

[1] http://www.niallkennedy.com/blog/2008/02/browser-history-sniff.html
[2] http://www.safehistory.com/


2008 - The Year of Con-currency

02

January

Predictions are needs-based, and what I’m seeing technologists need right now are tools to effectively handle concurrency, conmen and distribution of the wealth (personal information).

Right now everyone is talking about identity, open data and sharing of content across sites securely.
There’s aggravation for web developers with the lack of innovation going on; Ruby on Rails has hit the Gartner hype cycle’s trough of disillusionment and browser manufacturers are bearing the brunt of bored developers.
The only area I see keeping minds up at night involves concurrency. Reddit is riddled with it. Scaling to many cores, distributing that data and the privacy issues that arise with distribution. Identity and reputation again at the forefront. Wasn’t 2007 meant to be the year of online identity?
Media is as always the driving force. Viral videos, viral devices and the sharing involved with anything viral. Twitter is set to lead the echo-chamber charge here. With the Qik-fix of live-video becoming Seesmic in proportions with all the hot air sending Kyte flying. All signs point to concurrency. Farming out of identical functions to apply to the many processes (and people) that make further processes light work. Soon those processes will reach astronomical proportions thanks to the Comet now orbiting the planet. It’s Publish-Subscribe prime time. HTTP and HTML are once again getting a work-over, with Waka due as a following revision.

2008 will be the year of concurrency.

Best of all *cough*, all this concurrency will bring people (through their social graphs) together in tribe-like fashion once more. We’re seeing it now with ID cards and the draconian ISP filter laws the Australian government is trying to enact, which will only bolster the peer-to-peer revolution that follows into action. Again, all because of porn.


My Mobile Internet Device Form Factors

22

December

I’ve been thinking about form-factor classes to do with internet enabled devices and decided to come up with my own list.
Here goes:

  • Digital Audio Player*
  • Phone*
  • Pocketable (N810)**
  • Portable (eg. Everun/WiBrain)**
  • Microbook (HTC Shift)
  • eReader* (Kindle)
  • Minibook* (Eee PC 7-10″)
  • Tablet PC
  • Notebook*

*Gadgets I can use.
**Pocketable vs Portable is a hard one. A fast one with touch-screen friendly apps is important either way. The Quickly-look-something-up-around-the-house device. Calculator/TV Remote replacement. :)

This leaves me with the question; what do I want in a form-factor?

  • Digital Audio Player.
  • This must have some form of wireless capability with syncing to my online music/podcast store and be able to display album art. 5 day battery life. Light for jogging.

  • Phone.
  • GPS, excellent camera, recorder, wireless, - notes, todo, calls, texting, web surfing (full internet experience), simple apps. 3 day battery life.

  • Pocketable.
  • Do I even need a pocketable? You can’t read a book for long on something this size. They’re awkward in your pocket. They do have better input. Better internet experience for short usage patterns. Twitter and couch surfing friendly.

  • Portable
  • Two hand, no table, cramped space use; pocketable also does this well. Pocketable doesn’t have the better cursor navigation features or larger screen. How long you hold something this size or smaller dictates input usage patterns.

  • Microbook
  • Table top and lap friendly. Rest your arms. Too small keyboard for long stints. Short tumbler entry limiting. Potential travel-and-sit short-note taking text-entry/reader device.

  • eReader
  • Reading on the go. Must be wireless and have a good screen in all conditions. Grey scale or colour. (OLPC transition like) Light for holding in your hand for extended periods of time. Touch screen keyboard. Text entry comes secondary to reading.

  • Minibook
  • Travel blogger’s laptop: a carry-everywhere read and entry device. 6-8 hr working day battery life. Just big enough to touch type comfortably.

  • Tablet PC
  • Bigger more powerful version of the eReader, nuff said.

  • Notebook
  • Work horse, does everything. 4+ hours battery life as you’re probably near a socket.

  • SLR Digital Camera
  • Chunky lens. :)

I haven’t mentioned using peripherals with these devices such as external keyboards and mice. Screen size dictates that trade-off. If you need a big keyboard then you’re gonna type a lot, so you need a big screen. Get a 7″ minibook+. Hmm. I really want a 9″ Eee PC and a proper eReader.

