Mangle – a language for deductive database programming

Original link: https://github.com/google/mangle

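Mangle is an open source programming language based on Datalog, designed for deductive database programming. It excels at integrating data from different sources into a single queryable system and at modeling complex domain knowledge that goes beyond simple binary relations, similar to a flexible, machine-readable ontology. Although Mangle extends Datalog with features such as aggregation and function calls, it gives up some guarantees (such as strict termination) in exchange for practicality. Developers can define rules (for example, identifying vulnerable software based on log4j versions) that resemble SQL queries but add advantages such as named, reusable rules and support for recursion. Mangle is implemented as a Go library, is easy to embed in applications, and encourages experimentation. It is particularly suited to tasks that need knowledge representation, such as requirements engineering, and it handles both binary and n-ary relations, as the travel itinerary example shows. The project welcomes contributions and provides documentation, a Q&A section, and a demo service for exploration.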


Original text

Mangle is a programming language for deductive database programming. It extends Datalog with features such as aggregation, function calls and optional type-checking.

Deductive database programming is useful for bringing data from multiple data sources together, since it enables us to represent and query that data in a uniform way. It can also be used to model domain knowledge, similar to a machine-readable ontology but without being restricted to binary predicates.

Datalog is an expressive declarative language similar to relational calculus (think SQL and relational views). Unlike relational calculus, it also supports recursive rules and program structuring in a straightforward way.

Mangle contains Datalog as a fragment and adds extensions that make its use more practical. Some of the good properties like guaranteed termination are lost when such extensions are used.
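
To see why an extension such as function calls can cost guaranteed termination, consider the Python sketch below. It is an illustration only and not Mangle's evaluator: it runs naive bottom-up evaluation on two hypothetical rules, one in plain Datalog style and one that computes a new number each round. The `max_rounds` limit is an assumption added so the demonstration itself stops.

```python
def evaluate(facts, step, max_rounds):
    """Naive bottom-up evaluation: apply `step` until no new facts appear.

    Returns (facts, reached_fixpoint). `max_rounds` is a safety limit added
    for this sketch; it is not part of Datalog semantics.
    """
    facts = set(facts)
    for _ in range(max_rounds):
        new_facts = step(facts) - facts
        if not new_facts:
            return facts, True
        facts |= new_facts
    return facts, False


def pure_rule(facts):
    # Informally: reachable(Y) :- reachable(X), edge(X, Y).
    # Only constants that already occur in the facts can be derived.
    edges = {("a", "b"), ("b", "c"), ("c", "a")}
    return {y for x in facts for (src, y) in edges if src == x}


def rule_with_function(facts):
    # Informally: n(Y) :- n(X), where Y is computed as X + 1.
    # Every round produces a number that did not exist before.
    return {x + 1 for x in facts}


finite, finite_done = evaluate({"a"}, pure_rule, max_rounds=100)
print(sorted(finite), finite_done)    # ['a', 'b', 'c'] True

growing, growing_done = evaluate({0}, rule_with_function, max_rounds=100)
print(len(growing), growing_done)     # 101 False
```

The first rule reaches a fixpoint because it can only recombine a finite set of constants; the second never does, because the function keeps creating new values.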

The goal of Mangle as an open source project is to convey the concepts in a way that is accessible to developers and lends itself to easy experimentation. This repository contains an implementation of Mangle as a Go library that can be easily embedded into applications.

Check out the docs and the GitHub discussions for more information. There is also a Q&A section.

For an example of how to use the Mangle library in a database-like gRPC service, see the separate Mangle demo service repo.

This is not an officially supported Google product.

Imagine you were asked to spot software affected by the log4j vulnerability discovered in late 2021. We want to look for projects that contain a Java archive (jar file) of log4j that is not updated to the patched version.

projects_with_vulnerable_log4j(P) :-
  projects(P),
  contains_jar(P, "log4j", Version),
  Version != "2.17.1",
  Version != "2.12.4",
  Version != "2.3.2".

This is a Mangle rule: conceptually, the implementation retrieves all possible values for the variables P and Version that make all the subgoals true.

Simple Mangle rules like this correspond to select-project-join relational queries. The same query in SQL would look like this:

SELECT projects.id AS P
FROM projects JOIN contains_jar ON projects.id = contains_jar.project_id
WHERE contains_jar.name = 'log4j'  -- assumes a "name" column holding the jar name
  AND contains_jar.version NOT IN ('2.17.1', '2.12.4', '2.3.2')

Unlike SQL, our Mangle rule projects_with_vulnerable_log4j has a name and can be referenced in other queries.
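
The "all values that make every subgoal true" reading can be sketched in plain Python. This is an illustration of the semantics under invented facts, not how the Mangle library evaluates rules; the project names and versions below are hypothetical.

```python
# Hypothetical facts, invented for illustration.
projects = {"web-shop", "billing", "search"}
contains_jar = {
    ("web-shop", "log4j", "2.14.1"),
    ("billing", "log4j", "2.17.1"),
    ("search", "guava", "31.0"),
}
PATCHED = {"2.17.1", "2.12.4", "2.3.2"}


def projects_with_vulnerable_log4j():
    """Return every P for which all subgoals of the rule hold."""
    return {
        p
        for p in projects                        # projects(P)
        for (q, name, version) in contains_jar   # contains_jar(P, "log4j", Version)
        if q == p and name == "log4j"            # join on P, select the log4j jar
        and version not in PATCHED               # Version != each patched release
    }


print(projects_with_vulnerable_log4j())  # {'web-shop'}
```

The set comprehension plays the role of the join, the `if` conditions are the selection, and returning only `p` is the projection.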

(If translating non-recursive Datalog into SQL queries sounds interesting, you should check out the Logica open source project.)

In practice, querying is rarely enough and we also need grouping and aggregation.

count_projects_with_vulnerable_log4j(Num) :-
  projects_with_vulnerable_log4j(P) |> do fn:group_by(), let Num = fn:count().
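
As a rough analogy, assuming that grouping with no key places every row in a single group, the pipeline above behaves like the following Python sketch. The helper `group_and_count` and all rows are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def group_and_count(rows, key):
    """Group rows by key(row) and count the rows in each group."""
    counts = defaultdict(int)
    for row in rows:
        counts[key(row)] += 1
    return dict(counts)


# Hypothetical output of projects_with_vulnerable_log4j(P), one tuple per P.
vulnerable = [("web-shop",), ("intranet",), ("batch-jobs",)]

# Empty grouping key: every row lands in one group, so the result is one count.
print(group_and_count(vulnerable, key=lambda row: ()))  # {(): 3}

# Hypothetical (project, team) rows grouped by team, for comparison.
owned = [("web-shop", "retail"), ("intranet", "it"), ("batch-jobs", "it")]
print(group_and_count(owned, key=lambda row: row[1]))   # {'retail': 1, 'it': 2}
```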

The example does not specify what contains_jar does. Here is a possible implementation for contains_jar that walks a dependency graph. This shows that Mangle rules can be recursive.

contains_jar(P, Name, Version) :-
  contains_jar_directly(P, Name, Version).

contains_jar(P, Name, Version) :-
  project_depends(P, Q),
  contains_jar(Q, Name, Version).

The two rules correspond to two cases in which a project may "contain" a jar: either directly, or through some dependency.
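
One way to picture how such recursive rules produce an answer is naive bottom-up evaluation: start from the directly known facts and keep applying the recursive rule until nothing new appears. The Python sketch below shows that idea with hypothetical facts; it is not a description of how the Mangle library is implemented.

```python
def contains_jar(contains_jar_directly, project_depends):
    """Compute contains_jar as the least fixpoint of the two rules."""
    # Rule 1: contains_jar(P, Name, Version) :- contains_jar_directly(P, Name, Version).
    derived = set(contains_jar_directly)
    while True:
        # Rule 2: contains_jar(P, Name, Version) :-
        #           project_depends(P, Q), contains_jar(Q, Name, Version).
        new_facts = {
            (p, name, version)
            for (p, q) in project_depends
            for (owner, name, version) in derived
            if owner == q
        } - derived
        if not new_facts:
            return derived
        derived |= new_facts


# Hypothetical facts: app -> lib -> logging-utils, which ships log4j 2.14.1.
direct = {("logging-utils", "log4j", "2.14.1")}
depends = {("app", "lib"), ("lib", "logging-utils")}

for fact in sorted(contains_jar(direct, depends)):
    print(fact)
# ('app', 'log4j', '2.14.1')
# ('lib', 'log4j', '2.14.1')
# ('logging-utils', 'log4j', '2.14.1')
```

Because the derived facts form a set over finitely many projects and jars, the loop stops even when the dependency graph contains cycles.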

Knowledge Graphs, Property Graphs

In requirements engineering, one needs to capture real-world concepts in a domain model and a controlled vocabulary. Description logics use roles to describe how concepts interact, but these relationships are always binary. Mangle can represent binary predicates, but also arbitrary n-ary relations. Moreover, it supports structured data.

one_or_two_leg_trip(Codes, Start, Destination, Price) :-
  direct_conn(Code, Start, Destination, Price)
  |> let Codes = [Code].

one_or_two_leg_trip(Codes, Start, Destination, Price) :-
  direct_conn(FirstCode, Start, Connecting, FirstLegPrice),
  direct_conn(SecondCode, Connecting, Destination, SecondLegPrice)
  |> let Codes = [FirstCode, SecondCode],
     let Price = fn:plus(FirstLegPrice, SecondLegPrice).

graph LR
    /zurich -->|/code/ZL <br /> 60 CHF| /lausanne
    /zurich -->|/code/ZB <br /> 30 CHF| /bern
    /bern -->|/code/BL <br /> 30 CHF| /lausanne
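
To check what the two rules are meant to derive on the connections in the diagram, here is a small Python sketch that mirrors them. It illustrates the intended meaning only; the tuple layout for `direct_conn` is an assumption taken from the argument order in the rules.

```python
# direct_conn(Code, Start, Destination, Price), read off the diagram (prices in CHF).
direct_conn = [
    ("/code/ZL", "/zurich", "/lausanne", 60),
    ("/code/ZB", "/zurich", "/bern", 30),
    ("/code/BL", "/bern", "/lausanne", 30),
]


def one_or_two_leg_trip(conns):
    """Return (codes, start, destination, price) for every 1- or 2-leg trip."""
    trips = []
    # First rule: a single direct connection.
    for code, start, dest, price in conns:
        trips.append(([code], start, dest, price))
    # Second rule: two connections that share the connecting city.
    for code1, start, connecting, price1 in conns:
        for code2, start2, dest, price2 in conns:
            if start2 == connecting:
                trips.append(([code1, code2], start, dest, price1 + price2))
    return trips


for trip in one_or_two_leg_trip(direct_conn):
    if trip[1] == "/zurich" and trip[2] == "/lausanne":
        print(trip)
# (['/code/ZL'], '/zurich', '/lausanne', 60)
# (['/code/ZB', '/code/BL'], '/zurich', '/lausanne', 60)
```

Both ways of getting from /zurich to /lausanne cost 60 CHF here, and the list of codes records which legs make up each trip.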

Get the dependencies (see go.mod), build the library, run tests:

go get -t ./...
go build ./...
go test ./...

Regenerating the parser sources

If you want to regenerate the parser sources, you need to set up ANTLR first. This requires a Java runtime environment.

wget http://www.antlr.org/download/antlr-4.13.2-complete.jar
alias antlr='java -jar $PWD/antlr-4.13.2-complete.jar'
antlr -Dlanguage=Go -package gen -o ./ parse/gen/Mangle.g4 -visitor

The Mangle maintainers welcome external contributions to the spec, the documentation and this implementation (see CONTRIBUTING.md), as well as other implementations. Pull requests are handled the same way as for TensorFlow, to ensure that our internal usage and tests keep passing.
