    This board discusses the Semantic Web (Web 3.0) and related theory, such as ontology, OWL (Web Ontology Language), Description Logic (DL), RDFa, and ontology engineering.

    Google Base: a prototype Semantic Web system? [Web 2.0 vs. Semantic Web]


    Google Base looks like a semantic content publishing system with a user-modifiable ontology. Very interesting!

    Google Base: http://base.google.com/

    The relationship between Google Base and the Semantic Web: http://blog.donews.com/sayonly/archive/2005/10/28/605465.aspx

    Google Base works around the lack of a Semantic Web
    "Google appears to be bypassing the need for users to individually publish their data (in RDF), by asking them to create their data records directly into the Google database. Data within the Google Base can therefore be explicitly related (if desired), negating the potentially laborious normalising/mapping of distributed semantic web data.

    Is this likely to evolve into a Semantic Web application? Are Google recording this data in an RDF data store? (I doubt it) Or will they expose a Semantic Web friendly API (e.g. SPARQL), or publish the data as RDF? It's unlikely that the data will be easily exposed as full recordsets - the value of the data is key to the application, to share it could be dangerous."

    Semantic Web is a dream; Semantic Web technology is the reality.
    Weblog: http://blog.w3china.org/~orangebench/

    The relationship between Google Base and the Semantic Web http://blog.donews.com/sayonly/archive/2005/10/28/605465.aspx

    Google Home Base
    - 只说 (Sayonly), sayonly.com, Startup Survival Handbook series

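    This article follows a series of clues to trace the relationship between Google Base and the Semantic Web (hereafter SW), and from that to glimpse Google's strategic positioning in Internet services. It belongs to the Startup Survival Handbook series, and since Sayonly said in the series opener that the series would cover Web 2.0, it also compares the SW with Web 2.0. Most of the SW material cited here is in English; readers who are able to could translate and promote it, which would greatly raise the overall level of the domestic Internet community.
    Dedicated to another SW: Simon Willison.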

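    1. Google Base
    Google Base (presumably base.google.com, currently inaccessible) has not launched yet, but rumors are already flying. Following the link provided by webleon, Google product development manager Tom Oliveri has posted a list that gives the official explanation (translated by Sayonly):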



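    2. Google's close encounters with the Semantic Web
    A few years ago Simon Willison wrote a short blog post marveling that Google was doing some research on the SW. He had come across a sweeping essay, written from the perspective of the future, that described how Google defeated competitors such as Amazon and eBay; its author was Paul Ford. Simon Willison is a very geeky programmer whose blog I have long read, even if I do not always fully understand it. He has since joined Yahoo. Amusingly, his initials are also SW, so this article is dedicated to him (strictly speaking it should be just this section, but saying so would be rude).
    Paul Ford's essay, which has been cited many times (also by Stuart), is set in 2009, when Google dominates the Internet as a medium, and looks back on how it beat Amazon and eBay. It is really a fairly accessible overall account of what the SW is, and it is fun to read. Equally fun is EPIC, though that has nothing to do with the SW.
    Several universities on the US East Coast have in fact been doing applied SW research for a long time; the most fruitful result is probably Piggy Bank.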

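    In 2003 Google bought a small company called Applied Semantics, presumably for use in Google AdSense. This prompted someone to write an article on Google's investment in the SW, which is worth a look.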

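    Peter Norvig, Google's Director of Search Quality, published a piece early this year on what the SW can and cannot do. It is one of the most incisive pieces on SW applications that Sayonly has read; the series is long and explores SW applications and concepts from many angles. Norvig is a man of real vision. I have long followed his website, and although he still has no blog, the site finally has an RSS feed. He also wrote a classic essay, Teach Yourself Programming in Ten Years, which many people later translated; he actually wrote it while at the NASA research center. How time flies.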

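    The appearance of Google Base today surely owes something to Norvig's vision and drive. Europeans are actually more eager than Americans to realize the SW and already have a semantic weblog, for example qlogger.com, but nobody there combines Norvig's technical depth with the backing of a company like Google.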

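    Backed by the largest index of web pages and having pushed web crawling to its limit, Google is very likely to be the first commercial organization to realize the SW in part, in terms of both technology and market. The SW is of course an ideal, but Google Base at least gives us a first taste of it.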


    3. What is the Semantic Web?
    To explain the SW we first have to talk about its inventor, Tim Berners-Lee, who is also the inventor of the WWW.

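    In recent years, whenever Tim Berners-Lee's talks turn to the development of the Internet (usually on a page titled Future), they invariably mention the SW, perhaps because after inventing the WWW he could not invent anything else, or because everything else seemed dull. There are other topics too: a report in May pointed out that the web on mobile phones now faces the same difficulties the Internet faced on PCs in 1996. The SW, of course, concerns the Internet as a whole and has nothing to do with the access device. The report devoted to the SW is titled to the effect that the SW is here, and it lists SW progress at vendors such as Nokia, HP and IBM. Details of the discussions at that meeting are also available, though without the SW illustration from the report: an elephant assembled from all kinds of materials, including bricks and timber. It vividly shows that under the SW the whole world is composed of identifiable materials of every kind: many things to many people. Sayonly also likes another of their slogans: Web Evolution causing a quiet revolution.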

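    The core idea of the SW is that web content consists of many kinds of identifiable data. In the early Internet, around 1993, everything was a file, transferred with tools such as FTP. Around 1994 the Internet took textual form: HTML and URIs (unique addresses) appeared, and content could be accessed by address. As it continues to evolve, content will live in markable data structures such as XML. A web page becomes just one tool for displaying that data, which can be presented in any other form and can even be recognized by machines. The Internet will no longer be made of documents and pages but of fine-grained pieces of data.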
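
    A minimal sketch of that shift from pages to pieces of data, in Python. The article URI and values are made up for illustration, and the `dc:` property names borrow Dublin Core terms as an assumption; this is not a real RDF store.

```python
# Hypothetical article URI and values, for illustration only.
ARTICLE = "http://example.org/articles/1"

# Each fact is one (subject, predicate, object) triple, the shape RDF uses.
triples = [
    (ARTICLE, "dc:title", "Google Base and the Semantic Web"),
    (ARTICLE, "dc:date", "2005-10-28"),
    (ARTICLE, "dc:description", "How Google Base relates to the SW."),
]


def objects(subject, predicate, store):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in store if s == subject and p == predicate]


def render_html(subject, store):
    """Render one possible view; the page is derived from the data, not the reverse."""
    title = objects(subject, "dc:title", store)[0]
    date = objects(subject, "dc:date", store)[0]
    return f"<h1>{title}</h1><p>{date}</p>"
```

    The same triples could just as well feed a feed generator or a search index; the HTML view is only one consumer.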

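    That sounds rather abstract, so here is a simpler explanation. The SW cuts existing Internet content into pieces: the article title is stored as a title, the publication time as a publication time, the summary as a summary, each kept separately and each machine-recognizable (the reality is of course more complicated). Paul Ford's 2002 essay on how Google beat Amazon and eBay says it is simply another way of describing this content, one that machines can recognize. The exact mechanism is not entirely clear, but logically it is no different from the way you learned things in school: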

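    4. How does Google implement the Semantic Web?
    How Google will actually implement the SW can already be glimpsed in Peter Norvig's piece on what the SW can and cannot do. By January this year (or earlier) Norvig had already worked out how to get started, or rather how to build the SW step by step. He raised four issues:
    Sayonly wants to expand on this issue. Google is not really trying to build the SW that Tim Berners-Lee and others envision, because Google only needs to index the information in the SW. If the SW existed, indexing it would be simple, even simpler to implement as a product than Google's current search engine, and with lower technical demands. But here is the problem: do you first build a SW and then index it, or first index the whole Internet and then organize the result into a SW? That is why Google runs into a chicken-and-egg problem in building the SW.
    Sayonly's guess is that Google Base takes the following approach: information on today's Internet is hard to organize, so let users submit organized information to Google, which forms a local SW. Within this local SW, items such as descriptions of party services, website articles on current events, and used-car listings can be located precisely, and machines can understand the information within that scope.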



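    This is because of several further issues mentioned below, which arise just the same when content is put into these standard formats. Google cannot control those steps; or, seen from the Internet as a whole, the companies, services and tools that do control them are too scattered to form a standard or to guarantee security and quality. Norvig gave a Google News example: the night before, Google News had indexed news from 658 different sources. Google could run a clustering computation over those news pages and work out that the most important story was about Blair, whereas doing the same thing by relying on the news sources themselves to supply that information would be almost impossible.
    Still, the quality of results computed by indexing the news on their pages is not high, so Google has now come up with another approach: let users submit to Google through the Google Base interface. The submitted data follows predefined data standards. Google controls the submission process, can judge submission quality, spam and so on more accurately, and can analyze all kinds of data together.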
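
    The idea of accepting only submissions that follow a predefined data standard can be sketched as a simple schema check. The schema and field names below are hypothetical; the text does not give Google Base's actual attributes, and this is not a description of how Google validates anything.

```python
# Hypothetical schema for one item type (a used-car listing).
USED_CAR_SCHEMA = {"title": str, "price": float, "year": int}


def validate(record, schema):
    """Return a list of problems; an empty list means the record fits the schema."""
    problems = []
    # Every field the schema defines must be present with the expected type.
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}")
    # Fields the schema does not know about are reported rather than silently kept.
    for field in record:
        if field not in schema:
            problems.append(f"unknown field: {field}")
    return problems
```

    A receiving service could reject or flag any record for which `validate` returns a non-empty list, which is one way the submission step becomes a point of quality control.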


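    Cyc is a technical term for common-sense reasoning over a broad ontology. That may not be clear, so take an example. "周杰伦" (Jay Chou) is a person's name. If it is mistyped as "周杰论", a machine cannot recognize it; but with a very large lexicon, the machine can work out that "周杰论" probably means "周杰伦". That is a Cyc problem. How to handle such Cyc problems in the SW so as to arrive at common-sense judgments is something a true SW must solve.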
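
    The mistyped-name example can be illustrated with plain string similarity against a lexicon. This is only a sketch using Python's standard `difflib`; it is not Cyc and does no common-sense reasoning, and the three-name lexicon is a made-up stand-in for the "very large lexicon" the text assumes.

```python
import difflib

# Hypothetical lexicon of known names; a real one would be far larger.
KNOWN_NAMES = ["周杰伦", "王菲", "刘德华"]


def resolve_name(text, lexicon=KNOWN_NAMES, cutoff=0.6):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(text, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

    Calling `resolve_name("周杰论")` compares the input with each known name and picks "周杰伦" because two of its three characters match in order; an unrelated string yields `None`.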

    Incidentally, isn't a splog just Semantic Spam?

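    5. The Semantic Web and Web 2.0
    Web 2.0 is Tim O'Reilly's concept, and its definition was vague at first. It is best understood as a development model for Internet applications, one that has given rise to a new generation of applications and to new ways of understanding and using them (Sayonly has discussed the distinctions between these concepts elsewhere). Abroad, some have asked whether Web 2.0 will kill the SW, and others speak of Semantic Web 2.0; the discussions are interesting. The first article makes a fair point: Web 2.0 is used by a minority, whereas the SW will provide accessibility. Stefan Decker added that Web 2.0 emphasizes applications while the SW is about standards. That agrees with Sayonly's view that Web 2.0 is an application development model. Web 2.0 could also be used to characterize a company, but declaring loudly that Google is a Web 2.0 company and M$ a 1.0 company does sound a bit odd.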

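    The SW has of course produced many applications too: at the US East Coast universities, in Europe, where even a semantic weblog exists, and at DERI, which has built many applications.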

    gnowsis is another wild idea, though I have not yet understood its architecture diagram or why it contains a semantic web server.

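    Some say Google still views the whole Internet from the standpoint of information organization (Google's creed is organizing information), or that it is merely one organizer of the Internet's information and will later become an organizer of the SW's information. Fundamentally, the entire Internet medium is information and nothing but information. You may of course hold the other view, that Internet applications are what lead; at the deepest level the two converge.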

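    The Internet offers many opportunities to break the rules. Portal news and search-engine news have already broken the rules of traditional media, and classified-listing sites are breaking the rules of some e-commerce and recruitment sites. The soon-to-appear http://base.google.com/ service may well be an even bigger disruptor: it could attract many more individual content providers and thereby change the way Internet content has long been organized.
    The rule is actually simple: obtain the best-organized and most easily organized information with the least spam. The local SW that Google builds is of course controlled; yet the goal of the SW is not applications in the Web 2.0 style but accessibility. This revolution is so quiet that it hardly counts as "rule-breaking".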

    (This refers to the "quiet" in "Web Evolution causing a quiet revolution".)

    To restate this article's view: clearly, Google Base is Google's experiment and test in the SW, and the SW is Google's home base.




    Web 2.0 is a hot topic right now, and combining it with the Semantic Web will be very interesting!

    Kendall Clark puts it with some exaggeration: just thinking about SPARQL plus AJAX gives him a headache, because it is so cool!


    SPARQL over AJAX just by itself is so cool it gives me a headache!

    SPARQL: Web 2.0 Meet the Semantic Web
    Kendall Clark
Sep. 16, 2005 06:24 AM


    URL: [URL=http://www.w3.org/TR/2005/WD-rdf-sparql-protocol-20050914/]http://www.w3.org/TR/2005/WD-rdf-sparql-protocol-20050914/...[/URL]

    The Semantic Web. It's an odd duck, and not only from the publishing point of view. Academic computer science is starting to take the Semantic Web (which means, for them, webizing the Knowledge Representation part of AI) seriously. There are conferences, journals, books. Government-funded SW research, especially in the EU (but also in the US and Japan), is also on the rise.

    But in the geeky technical world, everything is about Web 2.0, not the Semantic Web. Which is fine, since there is considerable overlap between the Web, Web 2.0, and the Semantic Web. Lots of overlap, actually, and some pretty similar goals; the differences are mostly about use cases, emphasis, and some technical approach.

    Anyway, so SPARQL. RDF is pretty foundational to the Semantic Web, and it's got a data model, a formal semantics, and a concrete serialization (in XML). What it didn't have till lately was a standard query language. Imagine relational algebra and RDBMSes without SQL. Pretty hard to imagine. So the SemWeb needed a SQL. It stood up the Data Access Working Group, which has been working for about 20 months and has come up with SPARQL — an RDF query language and protocol.

    Most Web 2.0 applications and services involve a [URL=http://www.xml.com/pub/at/34]REST[/URL] protocol or interface. In other words, you can interact with the app or service by means of HTTP and manipulating resource representations, many of which are in XML, but others may be in JSON, YAML, RDF, etc.

    I think that's the way to build such apps/services, far better than an explicitly RPC-style interface. However, there is a bit of a problem. While using REST offers a standard set of operations (GET, PUT, POST, DELETE), it doesn't offer anything like a standard data manipulation language. In other words, there is no standard way to execute an arbitrary query against a Web 2.0 app or service's dataset and get back a representation of that resource or those resources.

    And, more to the point, the service or app provider has to explicitly support just those data manipulation primitives or operations which it thinks are most useful.

    That's great, but it's limiting.

    Since RDF is such a useful data representation formalism, and it now has an equally useful query language, more and more Web 2.0 sites can push more and more smarts and functionality into the place it belongs, namely, the data. REST conceptualizes (and HTTP standardizes) public interfaces; but neither does anything to standardize how one interacts, ad hoc'edly and without central control, with arbitrary slices of someone else's data.

    But SPARQL gives you precisely that, even when the data on the other end isn't really RDF, since all it has to do is support SPARQL query and map that into SQL or relational algebra or AtomStore or whatever.

    Okay, so SPARQL gives the SW and Web 2.0 a common data manipulation language in the form of expressive query against the RDF data model. Web 2.0 needs something exactly like that. (Imagine the horror of trying to get all of these totally uncoordinated Web 2.0 services and apps to support the same SQL queries? That's completely impossible. It will never happen. It may be hard to get them all to map SPARQL into how they really store data. It may never happen, in fact. But it could happen, and it will long before everyone uses the same RDBMS schema.)

    What else does it need? It needs a way for those queries and their results to be schlepped back and forth between apps/services and other computer agents that want to consume those apps/services's data. In other words, the SW and Web 2.0 need a data access protocol, which is the other thing SPARQL gives the world. Using WSDL 2.0, SPARQL Protocol for RDF describes a very simple web service with one operation, query. Available with both HTTP and SOAP bindings, this operation is the way you send SPARQL queries to other sites and the way you get back the results. The HTTP bindings are REST-friendly (though perhaps not maximally so, or so says REST advocate Mark Baker. Perhaps more about that later...) and a simple SPARQL protocol client takes about 10 or 15 lines of Python code.
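
    The claim that a simple SPARQL protocol client fits in 10 or 15 lines of Python can be illustrated roughly as follows. This is a sketch using only the standard library, not the client the author had in mind. The endpoint URL is hypothetical, and the code assumes the HTTP GET binding with the query passed in a `query` parameter and XML results requested through the Accept header.

```python
import urllib.parse
import urllib.request


def build_sparql_request(endpoint, query):
    """Build an HTTP GET request that carries the query in the `query` parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+xml"}
    )


def run_sparql(endpoint, query, timeout=10):
    """Send the query and return the raw response body as text."""
    request = build_sparql_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Hypothetical endpoint; nothing is sent until run_sparql is called.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT ?name WHERE { ?person <http://xmlns.com/foaf/0.1/name> ?name } LIMIT 5"
request = build_sparql_request(ENDPOINT, QUERY)
```

    Building the request separately from sending it keeps the protocol detail (one operation, one parameter) visible and lets the same builder be pointed at any site that exposes such an endpoint.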

    So what, really, can SPARQL do for Web 2.0? Imagine having one query language, and one client, which lets you arbitrarily slice the data of Flickr, delicious, Google, and yr three other favorite Web 2.0 sites, all FOAF files, all of the RSS 1.0 feeds (and, eventually, I suspect, all Atom 1.0 feeds), plus MusicBrainz, etc.

    Damn, that's not only a lot of data, but it's a lot of the data people actually care about. That's powerful stuff.

    How powerful? Well, imagine being able to ask Flickr whether there is a picture that matches some arbitrary set of constraints (say: size, title, date, and tag); if so, then asking delicious whether it has any URLs with the same tag and some other tag yr interested in; finally, turning the results of those two distributed queries (against totally uncoordinated datasets) into an RSS 1.0 feed. And let's say you could do that with two if-statements in Python and three SPARQL queries.

    Pretty damn cool.

    What needs to be done? Well, Web 2.0 fans, builders, and advocates need more love from SW fans, builders, and advocates. These two worlds really belong together. Next, Web 2.0 apps/services need to export (or make it easier for others to wrap) an RDF interface around their data. Then we need — as Leigh Dodds mentioned to me recently — a good SPARQL client implementation in Javascript, along with some conventions for building and moving queries around in an AJAX-friendly way. SPARQL over AJAX just by itself is so cool it gives me a headache! Last, SPARQL implementations have to spread and mature, but they're already off to a very good start. There are SPARQL tools in Java, Python, and other everyday languages.

    Frankly, I'm starting to catch the scent of one of those big convergence things just possibly starting to happen. It smells like money!


    The author argues that Web 2.0 and the Semantic Web are complementary.


    Is Web 2.0 killing the Semantic Web?
    Dan Zambonini
Oct. 07, 2005 01:44 AM


    (Disclaimer: Every now and again, I like to break up my bad cartoon blogs with some provocative, opinionated, ill-informed ramblings. This is one such entry.)

    I really want the Semantic Web (SW) explosion to happen, and sooner rather than later. But a sinking feeling in the pit of my stomach tells me that it's still a long way off. And worse still, that the Web 2.0 momentum could push it further back. Let me explain.

    Web 2.0: Beautiful but deadly

    Without getting into a protracted argument over the exact definition of "Web 2.0", let's go with the general consensus that it's all about people. That's it. It doesn't care about technology or standards; use AJAX, SVG, FOAF, PHP, Ruby, XHTML, P2P or XSL - it doesn't matter, just make sure that it's people-oriented. Let people create, collaborate, share and interact. Who cares what the back-end uses, or how it does it - just give "Power To The People", quickly and efficiently.

    The Semantic Web is the polar opposite: standardise all your data in RDF; encode it in XML (OK, so there's also N3, but it's probably mostly going to end up as XML); create your OWL. And then, once you have all this standardised data, let the machines loose on it! Because this data is for computer consumption, the SW should be more or less transparent to its users.

    So whilst Web 2.0 is about high-level (user experience) and immediate benefits, the SW is a low-level (data), long-term solution. Users are seeing all this cool, flexible new Web 2.0 stuff, and it's making the SW look even more complex, rigid and unnecessary. Both technologies appear similar to the outside world - share and aggregate data - but Web 2.0 has a pretty interface, and is here and now. And thus the (finite) budgets of organisations are being spent on wikis and blogs, rather than RDF database converters.

    Semantic Web: All hail the true king!

    But don't write off the SW. What do we really want from the future web? I mean really want? Web 2.0 has given us more efficient maps. We can share photos. And collectively criticise the same websites. But, you know something - so what? Are these the impacts we dream about making; is this our legacy when we die? The SW could save lives. Because it could enable the identification of otherwise un-detected patterns in large-scale, distributed data sets, it could help find medical cures and aid other problems in life sciences. It could help detect and prevent organised crime and terrorist activity. It might help analyse geological or meteorological data and limit the destruction of natural disasters. It could help detect and contain viruses and outbreaks. It could help distribute and re-use important educational resources. These are bold claims, but these are the goals we should be aiming for, and this is why we need the SW to flourish. We can't let a fancy map get in our way.

    What's the way forward? Well, we need the SW to take advantage of the Web 2.0 pile-driver. As [URL=http://www.w3.org/People/all#djweitzner]Daniel Weitzner[/URL] recently told me, it's all about finding the "sweet spot" between the formal SW semantics and the flexible, free-form Web 2.0. [URL=http://www.w3.org/TeamSubmission/grddl/]GRDDL[/URL] is one such project hoping to help us find this elusive middle-ground, by re-purposing existing web content into SW data.

    We can also take advantage of the flexibility of Web 2.0. As it is technology agnostic, we can use SW technologies in our Web 2.0 applications and get the best of both worlds (the FOAF RDF vocabulary has already succeeded at being integrated into many social networking applications).

    So let's push things forward. The Web 2.0 applications are amazing, efficient, and without doubt interesting and a huge step forward. But don't let them distract from the benefits that the SW could realise. Only 10% of the world population have internet access, and those of us who regularly use Web 2.0 applications are a very small niche within this. The SW benefits are further reaching; giving us developers new toys to play with, but also potentially impacting the lives of the other 6 billion people in this world without internet access.

    Dan Zambonini is the Technical Director of Box UK, a UK-based Internet Development and Consultancy company. An advocate of Semantic Web and XML technologies, he works with XML, XSL, RDF, SVG, P3P, OWL, XHTML, CSS, XForms, and a whole bunch of other acronyms.


    2005-10-09 03:45:52  bblfish

    SPARQL and AJAX (http://blogs.sun.com/roller/page/bblfish/20050917) make for a very powerful combination. And everyone can easily participate. It even seems to have created some enthusiasm with Tim Bray (http://blogs.sun.com/roller/page/bblfish/20051003) .
    Now I think Web 2.0 is going to play fabulously with the Semantic Web.


    An even zanier label: Semantic Web 2.0

    Semantic Web 2.0:  IAAI-05: AI Meets Web 2.0: Building The Web of Tomorrow Today
    On July 11 Marty Tenenbaum gave a buzz-generating, standing-room only presentation at [URL=http://www.aaai.org/Conferences/IAAI/2005/iaai05.html]IAAI-05[/URL], the Seventeenth Innovative Applications of Artificial Intelligence Conference. At this talk CommerceNet's [URL=http://www.commerce.net/]new Web site[/URL] was announced as well as the Semantic Web 2.0 effort, which combines Web 2.0 concepts, microformats, and AI to create the next evolution of the World-Wide Web. Interested parties are encouraged to contribute to the project wiki, which can be found at [URL=http://www.commerce.net/semweb2/]commerce.net/semweb2[/URL]. The presentation PDF is available at [URL=http://www.commerce.net/publications/]commerce.net/publications[/URL].

    Abstract: Imagine an Internet-scale Knowledge System where people and intelligent agents can collaborate on solving complex problems in business, engineering, science, medicine, and other endeavors. Its resources include semantically tagged Web sites, wikis, and blogs, as well as social networks, vertical search engines and a vast array of Web services from business processes to AI planners and domain models. Research prototypes of decentralized knowledge systems have been demonstrated for years, but now, thanks to the Web and Moore's Law, they appear ready for prime time. Architectural concepts for incrementally growing an Internet-scale knowledge system are introduced, with descriptions of early commercial deployments in manufacturing and healthcare.


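    A news report in the November issue of Nature notes that Google's databases are spread around the world, which brings the "data grid" to mind; perhaps Google will mark the point at which ordinary people start using data grids. For the full text, see: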


