
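Apache Arrow is a cross-language development platform for in-memory analytics. It defines a standardized columnar memory format that enables efficient data interchange and processing across a range of big-data workloads. Arrow has implementations in many programming languages, including C++, Python, and R, and offers features such as zero-copy reads and parallel computation.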
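Below is a simple C++ example that shows how to create an Arrow array:
#include <arrow/api.h>
#include <memory>
// Builds the Int64 array [1, 2, 3] and stores it in *out.
arrow::Status BuildArray(std::shared_ptr<arrow::Array>* out) {
  arrow::Int64Builder builder;
  // Append and Finish return arrow::Status, so propagate any failure.
  ARROW_RETURN_NOT_OK(builder.Append(1));
  ARROW_RETURN_NOT_OK(builder.Append(2));
  ARROW_RETURN_NOT_OK(builder.Append(3));
  return builder.Finish(out);
}
Below is a Python example that shows how to convert a Pandas DataFrame into an Arrow table: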
import pyarrow as pa
import pandas as pd
df = pd.DataFrame({'col1': [1, 2], 'col2': ['a', 'b']})
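table = pa.Table.from_pandas(df)
Arrow provides a rich set of APIs, including the C++ array builders and the pyarrow conversion from Pandas used in the examples above.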
Original-content statement: this article is published on the Tencent Cloud Developer Community with the author's authorization and may not be reproduced without permission.
In case of infringement, contact cloudcommunity@tencent.com to request removal.