News.EOS.WiKi Bilingual News & Info Of EOS

Why a Blockchain Is a Better Application Server / Database Architecture

Traditional web application infrastructure was designed with security as an afterthought, and for over 25 years companies have been attempting to patch a fundamentally insecure architecture. This architecture was designed on the assumption that a server could be trusted and secured, but years of experience have taught us that no server is safe from external attacks, let alone internal compromises. Stated another way, a server is fundamentally centralized.

We used to think that the “problem” was the connection between the user and the server, so we introduced SSL and HTTPS. Then we discovered that hackers would compromise the database and steal passwords, so we set about storing hashes of passwords. Then we discovered that hackers would brute-force the passwords after stealing the hashes, so we introduced password rotation in the hope that a password would be changed by the time it was cracked, and on and on.

Businesses are spending billions of dollars attempting to protect their servers and databases, and despite all of this effort there is still no easy way to audit systems and ensure that businesses operate as intended.

Block.one is building blockchain software to keep databases and user accounts secure against unauthorized access and unaccounted-for modifications. With blockchain, users adopt highly secure private keys which are stored in secure hardware and used to sign every user interaction rather than simply authenticate a connection to a server. The blockchain creates an immutable log establishing an absolute and deterministic order in which user inputs were received, and smart contracts provide deterministic business logic which ensures consistency across all systems.

The future Block.one is creating eliminates passwords and expensive auditing, saving companies billions of dollars, preventing identity theft, and offering increased reliability and auditability for all.

I have maintained for years that every multi-user website can benefit from the adoption of a blockchain backend. Contrary to popular opinion, blockchains don’t have to be slow, inefficient databases, nor do they have to operate on a censorship-resistant, open-access basis. A blockchain can provide a company huge improvements in security, auditability, transparency, and overall integrity of the business process even if the blockchain is entirely operated by the company itself and none of the contents of the blockchain are made public. This article aims to shine a light on the true value of blockchain in corporate environments and show the way forward for the blockchain industry.

Common Misconceptions

Within the blockchain industry many people are of the opinion that blockchains only provide benefit when they connect many parties who don’t trust each other. They believe that traditional database technologies can already do everything required to ensure business integrity. Stated another way, they view traditional database replication and “data integrity” guarantees as sufficient. In the process they either ignore or are unaware of the fundamentally different security and integrity guarantees that blockchains offer:

  • Commitment to Global Sequence of Events
  • Deterministic Execution of Business Logic
  • Tight Coupling of Business Logic & Data Integrity
  • Elimination of Passwords

In traditional business application architectures the business logic is separated from the database. There is usually an application server, such as Node.js or J2EE, which is given a password that lets it modify the database. It is the role of this application server (Node.js, say) to authenticate the user via a password or multi-factor authentication scheme. Once the application server authenticates the user, it issues a session token which is used to authenticate future user interactions until it times out or some element of the session (such as the IP address) changes.

It should be obvious that this traditional design performs all database operations via a single login/password managed by the application server. The application server is responsible for implementing its own authentication scheme with the ultimate end user. It should also be obvious that there are usually multiple parties who could gain access to the username and password. The database admin can assign and revoke credentials for many different application servers and/or individuals.

Advanced systems ensure that in a horizontally scaled system each application server has its own username/password, and in some cases they could even use a Public Key Infrastructure and Hardware Security Modules (HSMs). However, even here the database only authenticates the connection to the application server. In order to provide an audit log, it would have to record the entire data stream of the secure connection. Yet even this log only records the “reads and writes” requested by the application server, which has already lost all information about the original user’s intent as expressed to the application server.

An auditor reviewing such a system would have no way of knowing whether the application server (e.g. Node.js) was following the proper business logic and properly authenticated the ultimate end user. The Node.js process could “log” user actions into the database so that an auditor could attempt to reproduce the same calculations, but such a log is not tamper-resistant on its own and carries with it no independently verifiable authentication that the end user in question actually authorized the actions it logs.

An attempt could be made to log every user connection, but since the user often transmits their password over the connection, these logs would end up creating a honeypot for leaking user credentials. More sophisticated systems could encrypt these logs so that only the auditor can read them.

Assuming the audit log was not tampered with, the auditor would have to run the same action sequence through the application logic to verify that the resulting database state matched. This implies that the application server would have to be implemented in a deterministic manner.
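
The replay audit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the account-balance logic, the action format, and the names apply_action, replay, state_digest, and audit are all hypothetical and do not come from any particular framework.

```python
import hashlib
import json


def apply_action(state, action):
    """Hypothetical business logic: the result depends only on (state, action)."""
    balances = dict(state)
    user, amount = action["user"], action["amount"]
    if action["type"] == "deposit":
        balances[user] = balances.get(user, 0) + amount
    elif action["type"] == "transfer":
        if balances.get(user, 0) < amount:
            return balances  # rejected the same way on every replay
        balances[user] -= amount
        balances[action["to"]] = balances.get(action["to"], 0) + amount
    return balances


def replay(log):
    """Run the whole action log through the business logic from an empty state."""
    state = {}
    for action in log:
        state = apply_action(state, action)
    return state


def state_digest(state):
    """Hash a canonical encoding of the state (sorted keys, fixed separators)."""
    encoded = json.dumps(state, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def audit(log, live_state):
    """The audit passes only if replaying the log reproduces the live state."""
    return state_digest(replay(log)) == state_digest(live_state)


log = [
    {"type": "deposit", "user": "alice", "amount": 100},
    {"type": "transfer", "user": "alice", "to": "bob", "amount": 30},
    {"type": "transfer", "user": "bob", "to": "carol", "amount": 50},  # bob lacks funds
]
live_state = {"alice": 70, "bob": 30}
tampered_state = {"alice": 70, "bob": 300}
```

Here audit(log, live_state) succeeds and audit(log, tampered_state) fails, but only because apply_action reads nothing except its two arguments. The check also says nothing about whether the log itself is genuine.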

Deterministic Computation is Hard

While it may appear to be “simple” to write deterministic code, in practice all common computer languages are non-deterministic because they allow the developer to access data outside what is stored in the database. This could be something as simple as a timestamp, memory address, environment variable, IP address, or something far more subtle such as the behavior of floating point on the hardware or the order of insertion to a hashtable. In many cases simply accessing in-memory variables of a long-running application server is sufficient to introduce non-determinism. The very act of starting/stopping the application server must be logged and reproduced or every local memory access is potentially non-deterministic during replay.
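
A small Python sketch shows how easily these sources creep in. The receipt logic and the field names (items, logged_at) are hypothetical; the point is only the contrast between reading ambient state and reading nothing but the logged input.

```python
import time


def issue_receipt_nondeterministic(order):
    """Looks harmless, but a replay can produce a different result."""
    return {
        # Iteration order of a set of strings may differ between processes,
        # because Python randomizes string hashing per process by default.
        "items": list(set(order["items"])),
        # The wall clock is data from outside the logged input.
        "issued_at": time.time(),
    }


def issue_receipt_deterministic(order):
    """Uses only the logged input and imposes an explicit ordering."""
    return {
        "items": sorted(set(order["items"])),
        # Assumption: the timestamp was recorded in the log with the action.
        "issued_at": order["logged_at"],
    }


order = {"items": ["pear", "apple", "pear"], "logged_at": 1700000000}
first = issue_receipt_deterministic(order)
second = issue_receipt_deterministic(order)
```

The deterministic version returns the same receipt on every replay, on any machine. The other version depends on when and in which process it runs.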

The truth of the matter is that writing deterministic code is challenging even for the best developers, who are trained in the common pitfalls and actively looking for non-determinism. The typical business application developer would find it difficult or impractical to write in a deterministic manner.

If we go further and assume that the application code was deterministic and that the application faithfully recorded user events, we are still left with the challenge of tracking the version of the code deployed at any given time. Applications are dynamic and frequently updated, therefore the application code itself must also be part of the database state, and its updates must be managed and logged with the same degree of security and auditability as user actions. An auditor would then need to have a copy of all versions of the application server code and replay the user inputs through each version, upgrading as necessary (and restarting the code whenever it was restarted in the past).

Even if a single application server was able to operate in a deterministic manner both in terms of its implementation and deployment, it would still suffer from a major scalability concern. Only one instance of the application server could ever operate on the database. Parallel access could be implemented via complex locks, but even the race conditions on the locks would have to be logged and reproduced or two instances of the application logic with different local variables could produce non-deterministic output.

At this point one might be tempted to discard determinism altogether, but absent determinism a single bit of difference could compound over time into massive differences in the final data set. Auditors would be forced to use fuzzy logic and approximate matches and everyone would have to trust that the “fuzzy logic” was good enough.

Of course, the only thing it takes to negate all of the effort put into writing and deploying deterministic code is for the database administrator to modify the database directly and without a trace. In some cases a careful updating of the user input log and state can create two different database states that each pass the deterministic test but nevertheless have different and irreconcilable outputs. For example, suppose a professor submits a student’s grade to the system as an F, and then the student hacks or bribes his way into the database to change both his grade and the log entry submitted by the professor.
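
One way to make this kind of rewrite detectable is to chain the log entries by hash and commit to the latest hash somewhere the administrator cannot alter. This is the idea behind the commitment to a global sequence of events listed earlier. The sketch below is a minimal illustration, not any framework's actual format; the names entry_hash, append_entry, and verify_chain are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Each entry's hash covers its payload and the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain, payload):
    """Append a payload and return the new head hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})
    return chain[-1]["hash"]


def verify_chain(chain, committed_head):
    """Check every link, then check the chain ends at the committed head."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return prev == committed_head


chain = []
append_entry(chain, {"professor": "smith", "student": "jones", "grade": "F"})
head = append_entry(chain, {"professor": "smith", "student": "lee", "grade": "B"})

# The student rewrites the log from scratch, so it is internally consistent.
forged = []
append_entry(forged, {"professor": "smith", "student": "jones", "grade": "A"})
append_entry(forged, {"professor": "smith", "student": "lee", "grade": "B"})
```

The forged chain passes every internal link check, yet verify_chain(forged, head) fails because it no longer ends at the head that was committed earlier. Dropping the first entry fails the same way. The protection is only as strong as the place where the head is published.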

Replacing Passwords

The ultimate goal of any multiuser system that cares about integrity is ensuring that user inputs cannot be forged. The use of username/password or even other subjective multi-factor authentication (such as SMS or Google Authenticator) relies upon the server to conclude that the password matches or that the proper SMS code / email link / authenticator code was entered. It should be obvious that this is a huge problem for the integrity of the system, but just in case I will provide a real world example of just how bad these systems can be.

In 2016 I had an account on a cryptocurrency exchange. The account was hacked, and the hacker stole tens of thousands of dollars’ worth of Bitcoin. From my perspective, the hack showed up as a “password reset” email sent to my address, followed by another email indicating that my password had been successfully reset. Then I received an email asking for confirmation of the withdrawal of my bitcoin (with a code/link). Then I received a notice that my withdrawal had been processed.

At first glance it would appear that my email account had been hacked, but that shouldn’t have been possible given my use of multi-factor login on my email. A quick visit to my email security page showed that there was no unauthorized access to my email. I could tell because Google logs and displays all IPs/devices that access my email.

What had happened is that the attacker intercepted the email as the exchange was sending it, before it reached my email account. The application server had no way of knowing that the email was intercepted and therefore authorized the password reset and withdrawal based only on the attacker having the one-time code generated by the application server.

This same exploit could be used against SMS or any other technique that relies upon something other than a private key controlled by the user. At the end of the day, the only way to really secure a user’s account is for all users to adopt hardware-based private keys as their login credentials, combined with a robust and time-consuming process to facilitate a secure reset in the event the hardware keys are lost.

At this point a multi-user business application could sign every user request with the user’s private key, log this signed request in the database, and process it with deterministic code. Even this does not provide the integrity one expects, because the entire user request could still be deleted along with any side effects. Imagine hacking the police database and removing the signed request a police officer made when he submitted your ticket, followed by all state derived from that request.
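
The signing step can be illustrated with a hash-based Lamport one-time signature, chosen here only because it needs nothing beyond the Python standard library. It is a stand-in for illustration, not the signature scheme a production blockchain would use, and each key pair must sign only one message. The request format and function names are hypothetical.

```python
import hashlib
import secrets


def _h(data):
    return hashlib.sha256(data).digest()


def keygen():
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(zero), _h(one)) for zero, one in private]
    return private, public


def _bits(message):
    """The 256 bits of SHA-256(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message):
    """Reveal one secret per digest bit. The key must never be reused."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify_signature(public, message, signature):
    """Anyone holding the public key can check the request, not just the server."""
    if len(signature) != 256:
        return False
    return all(_h(signature[i]) == public[i][bit] for i, bit in enumerate(_bits(message)))


private_key, public_key = keygen()
request = b'{"action":"withdraw","amount":5,"nonce":1}'
signature = sign(private_key, request)
altered = b'{"action":"withdraw","amount":5000,"nonce":1}'
```

verify_signature(public_key, request, signature) succeeds, while the altered request fails against the same signature. The server never holds a secret that could be used to forge a request, so an auditor can check the user's authorization independently. As the paragraph above points out, a signature alone still does not stop a signed request from being deleted.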

At this point an astute engineer would claim that every single problem I raise can be solved by changing the application logic. He would be right: a sophisticated application developer could use a “traditional database”, a “traditional application server”, and “common cryptographic primitives” to construct a system that is relatively secure and auditable. By this same logic an astute engineer could claim that a database is entirely unnecessary and that everything should instead be built directly on the file system. Another engineer would point out that performance could be enhanced by coding everything from scratch instead of relying upon application server frameworks like Node.js and J2EE. It is almost as if everything is built out of lower-level tech and we might as well design in transistors for maximum performance.

I point to this extreme because it highlights the true utility of higher-level frameworks in accelerating and securing the development of new applications. Few people write their own cryptography libraries or algorithms, and those who do are either experts or serve as a cautionary tale when their system is hacked. Developing or redeveloping everything from the ground up can make every application more expensive than building on top of proven frameworks.

The Benefit of Blockchain Application/Database Servers

The reason blockchains and development frameworks like EOSIO exist is to free application developers from having to reinvent the “database” in order to build secure applications. Security and determinism are hard, which is why technology is built in layers that abstract the details. EOSIO combines a deterministic execution environment (WebAssembly) with a fast database in the same process. All user actions are signed by the users’ own private keys and logged in a replicated and distributed database with the ability to make public commitments to the block headers.

It is only a matter of time before frameworks like EOSIO are just as powerful and easy to develop for as traditional insecure systems. In many ways, the architecture of EOSIO is already higher performance than traditional systems by the simple virtue of putting the application logic (WebAssembly) in the same process space as the in-memory database. This creates deterministic stored procedures on steroids.

In the years ahead, Block.one aims to add the tools and interfaces to make deploying your business applications on top of blockchain as easy as (or easier than) deploying those same applications on top of traditional business application architectures.

It is clear that adoption of blockchain technology should become a priority for government agencies, public companies, and businesses that have a duty to prevent fraud and/or do financial reporting. Not adopting blockchain tech in the years ahead will be like a bank not adopting SSL, and once the technology is widely available, not using it could, in my view, be considered negligence.

Today is the time to act. Your businesses and users are not secure and will not be secure without a fundamental change in the way our applications are built. Every day you delay is another day your business is at risk of fraud or hacking.

Original URL:

https://medium.com/@bytemaster/why-a-blockchain-is-a-better-application-server-database-architecture-9d7b78730fbb
