glusterfs/configure.ac

943 lines
31 KiB
Plaintext
Raw Normal View History

dnl Copyright (c) 2006-2012 Red Hat, Inc. <http://www.redhat.com>
dnl This file is part of GlusterFS.
2009-02-18 17:36:07 +05:30
dnl
dnl This file is licensed to you under your choice of the GNU Lesser
dnl General Public License, version 3 or any later version (LGPLv3 or
dnl later), or the GNU General Public License, version 2 (GPLv2), in all
dnl cases as published by the Free Software Foundation.
2009-02-18 17:36:07 +05:30
AC_INIT([glusterfs],[3git],[gluster-users@gluster.org],,[https://github.com/gluster/glusterfs.git])
2009-02-18 17:36:07 +05:30
AM_INIT_AUTOMAKE
m4_ifdef([AM_SILENT_RULES], [AM_SILENT_RULES(yes)])
if make --help 2>&1 | grep -q no-print-directory; then
AM_MAKEFLAGS="$AM_MAKEFLAGS --no-print-directory";
fi
if make --help 2>&1 | grep -q quiet; then
AM_MAKEFLAGS="$AM_MAKEFLAGS --quiet"
fi
if libtool --help 2>&1 | grep -q quiet; then
AM_LIBTOOLFLAGS="--quiet";
fi
AC_CONFIG_HEADERS([config.h])
2009-02-18 17:36:07 +05:30
AC_CONFIG_FILES([Makefile
libglusterfs/Makefile
libglusterfs/src/Makefile
geo-replication/src/peer_gsec_create
glusterd/cli changes for distributed geo-rep Commands: gluster system:: execute gsec_create gluster volume geo-rep <master> <slave-url> create [push-pem] [force] gluster volume geo-rep <master> <slave-url> start [force] gluster volume geo-rep <master> <slave-url> stop [force] gluster volume geo-rep <master> <slave-url> delete gluster volume geo-rep <master> <slave-url> config gluster volume geo-rep <master> <slave-url> status The geo-replication is distributed. The session will be created, and gsyncd will be spawned on all relevant nodes, instead of only one node. geo-rep: Collecting status detail related data Added persistent store for saving information about TotalFilesSynced, TotalSyncTime, TotalBytesSynced Changes in the status information in socket: Existing(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01; New(Ex): FilesSynced=2;BytesSynced=2507;Uptime=00:26:01;SyncTime=0.69978; TotalSyncTime=2.890044;TotalFilesSynced=6;TotalBytesSynced=143640; Persistent details stored in /var/lib/glusterd/geo-replication/${mastervol}/${eSlave}-detail.status Change-Id: I1db7fc13ffca2e415c05200b0109b1254067f111 BUG: 847839 Original Author: Avra Sengupta <asengupt@redhat.com> Original Author: Venky Shankar <vshankar@redhat.com> Original Author: Aravinda VK <avishwan@redhat.com> Original Author: Amar Tumballi <amarts@redhat.com> Original Author: Csaba Henk <csaba@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5132 Reviewed-by: Vijay Bellur <vbellur@redhat.com> Tested-by: Vijay Bellur <vbellur@redhat.com>
2013-07-10 17:32:41 +05:30
geo-replication/src/peer_add_secret_pub
glusterfsd/Makefile
glusterfsd/src/Makefile
rpc/Makefile
rpc/rpc-lib/Makefile
rpc/rpc-lib/src/Makefile
rpc/rpc-transport/Makefile
rpc/rpc-transport/socket/Makefile
rpc/rpc-transport/socket/src/Makefile
rpc/rpc-transport/rdma/Makefile
rpc/rpc-transport/rdma/src/Makefile
rpc/xdr/Makefile
rpc/xdr/src/Makefile
xlators/Makefile
xlators/mount/Makefile
xlators/mount/fuse/Makefile
xlators/mount/fuse/src/Makefile
xlators/mount/fuse/utils/mount.glusterfs
xlators/mount/fuse/utils/mount_glusterfs
xlators/mount/fuse/utils/Makefile
xlators/storage/Makefile
xlators/storage/posix/Makefile
xlators/storage/posix/src/Makefile
bd: posix/multi-brick support to BD xlator Current BD xlator (block backend) has a few limitations such as * Creation of directories not supported * Supports only single brick * Does not use extended attributes (and client gfid) like posix xlator * Creation of special files (symbolic links, device nodes etc) not supported Basic limitation of not allowing directory creation is blocking oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM creates multi-level directories when GlusterFS is used as storage backend for storing VM images. To overcome these limitations a new BD xlator with following improvements is suggested. * New hybrid BD xlator that handles both regular files and block device files * The volume will have both POSIX and BD bricks. Regular files are created on POSIX bricks, block devices are created on the BD brick (VG) * BD xlator leverages exiting POSIX xlator for most POSIX calls and hence sits above the POSIX xlator * Block device file is differentiated from regular file by an extended attribute * The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a posix file to Logical Volume (LV). * When a client sends a request to set BD_XATTR on a posix file, a new LV is created and mapped to posix file. So every block device will have a representative file in POSIX brick with 'user.glusterfs.bd' (BD_XATTR) set. * Here after all operations on this file results in LV related operations. For example opening a file that has BD_XATTR set results in opening the LV block device, reading results in reading the corresponding LV block device. When BD xlator gets request to set BD_XATTR via setxattr call, it creates a LV and information about this LV is placed in the xattr of the posix file. xattr "user.glusterfs.bd" used to identify that posix file is mapped to BD. Usage: Server side: [root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2 It creates a distributed gluster volume 'bdvol' with Volume Group vg1 using posix brick /storage/vg1_info in host1 and Volume Group vg2 using /storage/vg2_info in host2. [root@host1 ~]# gluster volume start bdvol Client side: [root@node ~]# mount -t glusterfs host1:/bdvol /media [root@node ~]# touch /media/posix It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick [root@node ~]# mkdir /media/image [root@node ~]# touch /media/image/lv1 It also creates regular posix file 'lv1' in either host1:/vg1 or host2:/vg2 brick [root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1 [root@node ~]# Above setxattr results in creating a new LV in corresponding brick's VG and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size' [root@node ~]# truncate -s5G /media/image/lv1 It results in resizig LV 'lv1'to 5G New BD xlator code is placed in xlators/storage/bd directory. Also add volume-uuid to the VG so that same VG can't be used for other bricks/volumes. After deleting a gluster volume, one has to manually remove the associated tag using vgchange <vg-name> --deltag <trusted.glusterfs.volume-id:<volume-id>> Changes from previous version V5: * Removed support for delayed deleting of LVs Changes from previous version V4: * Consolidated the patches * Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR. Changes from previous version V3: * Added support in FUSE to support full/linked clone * Added support to merge snapshots and provide information about origin * bd_map xlator removed * iatt structure used in inode_ctx. iatt is cached and updated during fsync/flush * aio support * Type and capabilities of volume are exported through getxattr Changes from version 2: * Used inode_context for caching BD size and to check if loc/fd is BD or not. * Added GlusterFS server offloaded copy and snapshot through setfattr FOP. As part of this libgfapi is modified. * BD xlator supports stripe * During unlinking if a LV file is already opened, its added to delete list and bd_del_thread tries to delete from this list when a last reference to that file is closed. Changes from previous version: * gfid is used as name of LV * ? is used to specify VG name for creating BD volume in volume create, add-brick. gluster volume create volname host:/path?vg * open-behind issue is fixed * A replicate brick can be added dynamically and LVs from source brick are replicated to destination brick * A distribute brick can be added dynamically and rebalance operation distributes existing LVs/files to the new brick * Thin provisioning support added. * bd_map xlator support retained * setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and setfattr -n user.glusterfs.bd -v "thin" creates thin LV * Capability and backend information added to gluster volume info (and --xml) so that management tools can exploit BD xlator. * tracing support for bd xlator added TODO: * Add support to display snapshots for a given LV * Display posix filename for list-origin instead of gfid Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2 BUG: 1028672 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/4809 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2013-11-13 22:44:42 +05:30
xlators/storage/bd/Makefile
xlators/storage/bd/src/Makefile
xlators/cluster/Makefile
xlators/cluster/afr/Makefile
xlators/cluster/afr/src/Makefile
xlators/cluster/stripe/Makefile
xlators/cluster/stripe/src/Makefile
xlators/cluster/dht/Makefile
xlators/cluster/dht/src/Makefile
xlators/performance/Makefile
xlators/performance/write-behind/Makefile
xlators/performance/write-behind/src/Makefile
xlators/performance/read-ahead/Makefile
xlators/performance/read-ahead/src/Makefile
xlators/performance/readdir-ahead/Makefile
xlators/performance/readdir-ahead/src/Makefile
xlators/performance/io-threads/Makefile
xlators/performance/io-threads/src/Makefile
xlators/performance/io-cache/Makefile
xlators/performance/io-cache/src/Makefile
xlators/performance/symlink-cache/Makefile
xlators/performance/symlink-cache/src/Makefile
xlators/performance/quick-read/Makefile
xlators/performance/quick-read/src/Makefile
xlators/performance/open-behind/Makefile
xlators/performance/open-behind/src/Makefile
xlators/performance/md-cache/Makefile
xlators/performance/md-cache/src/Makefile
xlators/debug/Makefile
xlators/debug/trace/Makefile
xlators/debug/trace/src/Makefile
xlators/debug/error-gen/Makefile
xlators/debug/error-gen/src/Makefile
xlators/debug/io-stats/Makefile
xlators/debug/io-stats/src/Makefile
xlators/protocol/Makefile
xlators/protocol/auth/Makefile
xlators/protocol/auth/addr/Makefile
xlators/protocol/auth/addr/src/Makefile
xlators/protocol/auth/login/Makefile
xlators/protocol/auth/login/src/Makefile
xlators/protocol/client/Makefile
xlators/protocol/client/src/Makefile
xlators/protocol/server/Makefile
xlators/protocol/server/src/Makefile
xlators/features/Makefile
features/changelog: changelog translator This is the initial version of the Changelog Translator. What is it ----------- Goal is to capture changes performed on a GlusterFS volume. The translator needs to be loaded on the server (bricks) and captures changes in a plain text file inside a configured directory path (controlled by "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by default). Changes are classified into 3 types: - Data: : TYPE-I - Metadata : TYPE-II - Entry : TYPE-III Changelog file is rolled over after a certain time interval (defauls to 60 seconds) after which a changelog is started. The thing to be noted here is that for a time interval (time slice) multiple changes for an inode are recorded only once (ie. say for 100+ writes on an inode that happens within the time slice has only a single corresponding entry in the changelog file). That way we do not bloat up the changelog and also save lots of writes. Changelog Format ----------------- TYPE-I and TYPE-II changes have the gfid on the entity on which the operation happened. TYPE-III being a entry op requires the parent gfid and the basename. Changelog format has been kept to a minimal and it's upto the consumers to do the heavy loading of figuring out deletes, renames etc.. A single changelog file records all three types of changes, with each change starting with an identifier ("D": DATA, "M": METADATA and "E": ENTRY). Option is provided for the encoding type (See TUNABLES). Consumers ---------- The only consumer as of today would be geo-replication, although backup utilities, self-heal, bit-rot detection could be possible consumers in the future. CLI ---- By default, change-logging is disabled (the translator is present in the server graph but does nothing). When enabled (via cli) each brick starts to log the changes. There are a set of tunable that can be used to change the translators behaviour: - enable/disable changelog (disabled by default) gluster volume set <volume> changelog {on|off} - set the logging directory (<brick>/.glusterfs/changelogs is the default) gluster volume set <volume> changelog-dir /path/to/dir - select encoding type (binary (default) or ascii) gluster volume set <volume> encoding {binary|ascii} - change the rollover time for the logs (60 secs by default) gluster volume set <volume> rollover-time <secs> - when secs > 0, changelog file is not open()'d with O_SYNC flag - and fsync is trigerred periodically every <secs> seconds. gluster volume set <volume> fsync-interval <secs> features/changelog: changelog consumer library (libgfchangelog) A shared library is provided for the consumer of the changelogs for easy acess via APIs. Application can link against this library and request for changelog updates. Conversion of binary logs to human-readable ascii format is also taken care by the library which keeps a copy of the changelog in application provided working directory. Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497 BUG: 847839 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2013-06-04 14:20:58 +05:30
xlators/features/changelog/Makefile
xlators/features/changelog/src/Makefile
xlators/features/changelog/lib/Makefile
xlators/features/changelog/lib/src/Makefile
xlators/features/glupy/Makefile
xlators/features/glupy/src/Makefile
xlators/features/locks/Makefile
xlators/features/locks/src/Makefile
xlators/features/quota/Makefile
xlators/features/quota/src/Makefile
xlators/features/marker/Makefile
xlators/features/marker/src/Makefile
xlators/features/read-only/Makefile
xlators/features/read-only/src/Makefile
xlators/features/compress/Makefile
xlators/features/compress/src/Makefile
xlators/features/mac-compat/Makefile
xlators/features/mac-compat/src/Makefile
xlators/features/quiesce/Makefile
xlators/features/quiesce/src/Makefile
xlators/features/index/Makefile
xlators/features/index/src/Makefile
xlators/features/protect/Makefile
xlators/features/protect/src/Makefile
xlators/features/gfid-access/Makefile
xlators/features/gfid-access/src/Makefile
xlators/playground/Makefile
xlators/playground/template/Makefile
xlators/playground/template/src/Makefile
xlators/encryption/Makefile
xlators/encryption/rot-13/Makefile
xlators/encryption/rot-13/src/Makefile
Transparent data encryption and metadata authentication .. in the systems with non-trusted server This new functionality can be useful in various cloud technologies. It is implemented via a special encryption/crypt translator,which works on the client side and performs encryption and authentication; 1. Class of supported algorithms The crypt translator can support any atomic symmetric block cipher algorithms (which require to pad plain/cipher text before performing encryption/decryption transform (see glossary in atom.c for definitions). In particular, it can support algorithms with the EOF issue (which require to pad the end of file by extra-data). Crypt translator performs translations user -> (offset, size) -> (aligned-offset, padded-size) ->server (and backward), and resolves individual FOPs (write(), truncate(), etc) to read-modify-write sequences. A volume can contain files encrypted by different algorithms of the mentioned class. To change some option value just reconfigure the volume. Currently only one algorithm is supported: AES_XTS. Example of algorithms, which can not be supported by the crypt translator: 1. Asymmetric block cipher algorithms, which inflate data, e.g. RSA; 2. Symmetric block cipher algorithms with inline MACs for data authentication. 2. Implementation notes. a) Atomic algorithms Since any process in a stackable file system manipulates with local data (which can be obsoleted by local data of another process), any atomic cipher algorithm without proper support can lead to non-POSIX behavior. To resolve the "collisions" we introduce locks: before performing FOP->read(), FOP->write(), etc. the process should first lock the file. b) Algorithms with EOF issue Such algorithms require to pad the end of file with some extra-data. Without proper support this will result in losing information about real file size. Keeping a track of real file size is a responsibility of the crypt translator. A special extended attribute with the name "trusted.glusterfs.crypt.att.size" is used for this purpose. All files contained in bricks of encrypted volume do have "padded" sizes. 3. Non-trusted servers and Metadata authentication We assume that server, where user's data is stored on is non-trusted. It means that the server can be subjected to various attacks directed to reveal user's encrypted personal data. We provide protection against such attacks. Every encrypted file has specific private attributes (cipher algorithm id, atom size, etc), which are packed to a string (so-called "format string") and stored as a special extended attribute with the name "trusted.glusterfs.crypt.att.cfmt". We protect the string from tampering. This protection is mandatory, hardcoded and is always on. Without such protection various attacks (based on extending the scope of per-file secret keys) are possible. Our authentication method has been developed in tight collaboration with Red Hat security team and is implemented as "metadata loader of version 1" (see file metadata.c). This method is NIST-compliant and is based on checking 8-byte per-hardlink MACs created(updated) by FOP->create(), FOP->link(), FOP->unlink(), FOP->rename() by the following unique entities: . file (hardlink) name; . verified file's object id (gfid). Every time, before manipulating with a file, we check it's MACs at FOP->open() time. Some FOPs don't require a file to be opened (e.g. FOP->truncate()). In such cases the crypt translator opens the file mandatory. 4. Generating keys Unique per-file keys are derived by NIST-compliant methods from the a) parent key; b) unique verified object-id of the file (gfid); Per-volume master key, provided by user at mount time is in the root of this "tree of keys". Those keys are used to: 1) encrypt/decrypt file data; 2) encrypt/decrypt file metadata; 3) create per-file and per-link MACs for metadata authentication. 5. Instructions Getting started with crypt translator Example: 1) Create a volume "myvol" and enable encryption: # gluster volume create myvol pepelac:/vols/xvol # gluster volume set myvol encryption on 2) Set location (absolute pathname) of your master key: # gluster volume set myvol encryption.master-key /home/me/mykey 3) Set other options to override default options, if needed. Start the volume. 4) On the client side make sure that the file /home/me/mykey exists and contains proper per-volume master key (that is 256-bit AES key). This key has to be in hex form, i.e. should be represented by 64 symbols from the set {'0', ..., '9', 'a', ..., 'f'}. The key should start at the beginning of the file. All symbols at offsets >= 64 are ignored. 5) Mount the volume "myvol" on the client side: # glusterfs --volfile-server=pepelac --volfile-id=myvol /mnt After successful mount the file which contains master key may be removed. NOTE: Keeping the master key between mount sessions is in user's competence. ********************************************************************** WARNING! Losing the master key will make content of all regular files inaccessible. Mount with improper master key allows to access content of directories: file names are not encrypted. ********************************************************************** 6. Options of crypt translator 1) "master-key": specifies location (absolute pathname) of the file which contains per-volume master key. There is no default location for master key. 2) "data-key-size": specifies size of per-file key for data encryption Possible values: . "256" default value . "512" 3) "block-size": specifies atom size. Possible values: . "512" . "1024" . "2048" . "4096" default value; 7. Test cases Any workload, which involves the following file operations: ->create(); ->open(); ->readv(); ->writev(); ->truncate(); ->ftruncate(); ->link(); ->unlink(); ->rename(); ->readdirp(). 8. TODOs: 1) Currently size of IOs issued by crypt translator is restricted by block_size (4K by default). We can use larger IOs to improve performance. Change-Id: I2601fe95c5c4dc5b22308a53d0cbdc071d5e5cee BUG: 1030058 Signed-off-by: Edward Shishkin <edward@redhat.com> Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4667 Tested-by: Gluster Build System <jenkins@build.gluster.com>
2013-03-13 21:56:46 +01:00
xlators/encryption/crypt/Makefile
xlators/encryption/crypt/src/Makefile
xlators/features/qemu-block/Makefile
xlators/features/qemu-block/src/Makefile
xlators/system/Makefile
xlators/system/posix-acl/Makefile
xlators/system/posix-acl/src/Makefile
xlators/nfs/Makefile
xlators/nfs/server/Makefile
xlators/nfs/server/src/Makefile
xlators/mgmt/Makefile
xlators/mgmt/glusterd/Makefile
xlators/mgmt/glusterd/src/Makefile
cli/Makefile
cli/src/Makefile
doc/Makefile
extras/Makefile
extras/init.d/Makefile
extras/init.d/glusterd.plist
extras/init.d/glusterd-Debian
extras/init.d/glusterd-Redhat
extras/init.d/glusterd-SuSE
extras/systemd/Makefile
extras/systemd/glusterd.service
extras/benchmarking/Makefile
extras/hook-scripts/Makefile
extras/ocf/Makefile
extras/ocf/glusterd
extras/ocf/volume
extras/LinuxRPM/Makefile
extras/geo-rep/Makefile
features/quota: Add the quota config xattr to newly added brick Issue: Quota directory limit configuration is stored in the xattrs. When a new brick is added these 'limit-set' xattrs have to be created to the directory in the new brick. This is done by the dht directory healing when the directory is created in the new brick. Since 'root' directory is already created DHT doesn't heal the limit-set xattr root. Solution: When the add-brick command is issued run the below hook script to heal the 'limit-set' xattr. The hook script does the following only if limit is configured on root. 1. Create an auxiliary mount. 2. getxattr 'limit-set' on the root 3. setxattr the same value on the root But this script needs the volume to be started to make the auxiliary mount. To handle the case when the add-brick is issued when the volume was stopped, symlink is created by the 'master' script to the corresponding location and these two are by default disabled. So, a 'master' script is added in the add-brick/pre. When add-brick command is issued, it enables one of the scripts mentioned above based on the condition, if volume is started - enable add-brick/post script else - enable start/post script After the actual script completes its job, it disables itself. Note: The enabling and disabling of the script is based on the glusterd's logic, that it only runs the scripts which starts its name with 'S'. So, Enable - symlink the file to 'S'* Disable - unlink the symlink. Change-Id: I2d3947a4d686c54417ec95f530af3bdd3444f4e2 BUG: 969461 Signed-off-by: Varun Shastry <vshastry@redhat.com> Reviewed-on: http://review.gluster.org/6104 Reviewed-by: Brian Foster <bfoster@redhat.com> Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2013-10-15 17:25:51 +05:30
extras/hook-scripts/add-brick/Makefile
extras/hook-scripts/add-brick/pre/Makefile
extras/hook-scripts/add-brick/post/Makefile
contrib/fuse-util/Makefile
contrib/uuid/uuid_types.h
glusterfs-api.pc
features/changelog: changelog translator This is the initial version of the Changelog Translator. What is it ----------- Goal is to capture changes performed on a GlusterFS volume. The translator needs to be loaded on the server (bricks) and captures changes in a plain text file inside a configured directory path (controlled by "changelog-dir", should be somewhere in <export>/.glusterfs/changelog by default). Changes are classified into 3 types: - Data: : TYPE-I - Metadata : TYPE-II - Entry : TYPE-III Changelog file is rolled over after a certain time interval (defauls to 60 seconds) after which a changelog is started. The thing to be noted here is that for a time interval (time slice) multiple changes for an inode are recorded only once (ie. say for 100+ writes on an inode that happens within the time slice has only a single corresponding entry in the changelog file). That way we do not bloat up the changelog and also save lots of writes. Changelog Format ----------------- TYPE-I and TYPE-II changes have the gfid on the entity on which the operation happened. TYPE-III being a entry op requires the parent gfid and the basename. Changelog format has been kept to a minimal and it's upto the consumers to do the heavy loading of figuring out deletes, renames etc.. A single changelog file records all three types of changes, with each change starting with an identifier ("D": DATA, "M": METADATA and "E": ENTRY). Option is provided for the encoding type (See TUNABLES). Consumers ---------- The only consumer as of today would be geo-replication, although backup utilities, self-heal, bit-rot detection could be possible consumers in the future. CLI ---- By default, change-logging is disabled (the translator is present in the server graph but does nothing). When enabled (via cli) each brick starts to log the changes. There are a set of tunable that can be used to change the translators behaviour: - enable/disable changelog (disabled by default) gluster volume set <volume> changelog {on|off} - set the logging directory (<brick>/.glusterfs/changelogs is the default) gluster volume set <volume> changelog-dir /path/to/dir - select encoding type (binary (default) or ascii) gluster volume set <volume> encoding {binary|ascii} - change the rollover time for the logs (60 secs by default) gluster volume set <volume> rollover-time <secs> - when secs > 0, changelog file is not open()'d with O_SYNC flag - and fsync is trigerred periodically every <secs> seconds. gluster volume set <volume> fsync-interval <secs> features/changelog: changelog consumer library (libgfchangelog) A shared library is provided for the consumer of the changelogs for easy acess via APIs. Application can link against this library and request for changelog updates. Conversion of binary logs to human-readable ascii format is also taken care by the library which keeps a copy of the changelog in application provided working directory. Change-Id: I75575fb7f1c53d2bec3dba1a329ea7bb3c628497 BUG: 847839 Original Author: Venky Shankar <vshankar@redhat.com> Signed-off-by: Avra Sengupta <asengupt@redhat.com> Reviewed-on: http://review.gluster.org/5127 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2013-06-04 14:20:58 +05:30
libgfchangelog.pc
api/Makefile
api/src/Makefile
api/examples/Makefile
api/examples/setup.py
geo-replication/Makefile
geo-replication/src/Makefile
geo-replication/syncdaemon/Makefile
glusterfs.spec])
2009-02-18 17:36:07 +05:30
AC_CANONICAL_HOST
AC_PROG_CC
AC_DISABLE_STATIC
2009-02-18 17:36:07 +05:30
AC_PROG_LIBTOOL
# Initialize CFLAGS before usage
AC_ARG_ENABLE([debug],
AC_HELP_STRING([--enable-debug],
[Enable debug build options.]))
if test "x$enable_debug" = "xyes"; then
BUILD_DEBUG=yes
CFLAGS="${CFLAGS} -g -O0 -DDEBUG"
else
BUILD_DEBUG=no
CFLAGS="${CFLAGS} -g -O2"
fi
AC_ARG_WITH(pkgconfigdir,
[ --with-pkgconfigdir=DIR pkgconfig file in DIR @<:@LIBDIR/pkgconfig@:>@],
[pkgconfigdir=$withval],
[pkgconfigdir='${libdir}/pkgconfig'])
AC_SUBST(pkgconfigdir)
AC_ARG_WITH(mountutildir,
[ --with-mountutildir=DIR mount helper utility in DIR @<:@/sbin@:>@],
[mountutildir=$withval],
[mountutildir='/sbin'])
AC_SUBST(mountutildir)
AC_ARG_WITH(systemddir,
[ --with-systemddir=DIR systemd service files in DIR @<:@/usr/lib/systemd/system@:>@],
[systemddir=$withval],
[systemddir='/usr/lib/systemd/system'])
AC_SUBST(systemddir)
AC_ARG_WITH(initdir,
[ --with-initdir=DIR init.d scripts in DIR @<:@/etc/init.d@:>@],
[initdir=$withval],
[initdir='/etc/init.d'])
AC_SUBST(initdir)
AC_ARG_WITH(launchddir,
[ --with-launchddir=DIR launchd services in DIR @<:@/Library/LaunchDaemons@:>@],
[launchddir=$withval],
[launchddir='/Library/LaunchDaemons'])
AC_SUBST(launchddir)
Add OCF compliant resource agents for glusterd and volumes These resource agents plug glusterd into Open Cluster Framework (OCF) compliant cluster resource managers, like Pacemaker. The glusterd RA is fairly trivial; it simply manages the glusterd daemon like any upstart or systemd job would, except that Pacemaker can do it in a cluster-aware fashion. The volume RA is a bit more involved; It starts a volume and monitors individual brick's daemons in a cluster aware fashion, recovering bricks when their processes fail. Note that this does NOT imply people would deploy GlusterFS servers in pairs, or anything of that nature. Pacemaker has the ability to deploy cluster resources as clones, meaning glusterd and volumes would be configured as follows in a Pacemaker cluster: primitive p_glusterd ocf:glusterfs:glusterd \ op monitor interval="30" primitive p_volume_demo ocf:glusterfs:volume \ params volname="demo" \ op monitor interval="10" clone cl_glusterd p_glusterd \ meta interleave="true" clone cl_volume_demo p_volume_demo \ meta interleave="true" ordered="true" colocation c_volume_on_glusterd inf: cl_volume_demo cl_glusterd order o_glusterd_before_volume 0: cl_glusterd cl_volume_demo The cluster status then looks as follows (in a 4-node cluster; note the configuration above could be applied, unchanged, to a cluster of any number of nodes): ============ Last updated: Fri Mar 30 10:54:50 2012 Last change: Thu Mar 29 17:20:17 2012 via crmd on gluster02.h Stack: openais Current DC: gluster03.h - partition with quorum Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558 4 Nodes configured, 4 expected votes 8 Resources configured. ============ Online: [ gluster02.h gluster03.h gluster04.h gluster01.h ] Clone Set: cl_glusterd [p_glusterd] Started: [ gluster02.h gluster03.h gluster04.h gluster01.h ] Clone Set: cl_volume_demo [p_volume_demo] Started: [ gluster01.h gluster02.h gluster03.h gluster04.h ] This is also a way of providing automatic glusterd and brick recovery in systems where neither upstart nor systemd are available. Change-Id: Ied46657bdfd2dd72dc97cf41b0eb7adcecacd18f BUG: 869559 Signed-off-by: Florian Haas <florian@hastexo.com> Reviewed-on: http://review.gluster.org/3043 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2012-02-20 16:25:43 +01:00
AC_ARG_WITH([ocf],
[AS_HELP_STRING([--without-ocf], [build OCF-compliant cluster resource agents])],
Add OCF compliant resource agents for glusterd and volumes These resource agents plug glusterd into Open Cluster Framework (OCF) compliant cluster resource managers, like Pacemaker. The glusterd RA is fairly trivial; it simply manages the glusterd daemon like any upstart or systemd job would, except that Pacemaker can do it in a cluster-aware fashion. The volume RA is a bit more involved; It starts a volume and monitors individual brick's daemons in a cluster aware fashion, recovering bricks when their processes fail. Note that this does NOT imply people would deploy GlusterFS servers in pairs, or anything of that nature. Pacemaker has the ability to deploy cluster resources as clones, meaning glusterd and volumes would be configured as follows in a Pacemaker cluster: primitive p_glusterd ocf:glusterfs:glusterd \ op monitor interval="30" primitive p_volume_demo ocf:glusterfs:volume \ params volname="demo" \ op monitor interval="10" clone cl_glusterd p_glusterd \ meta interleave="true" clone cl_volume_demo p_volume_demo \ meta interleave="true" ordered="true" colocation c_volume_on_glusterd inf: cl_volume_demo cl_glusterd order o_glusterd_before_volume 0: cl_glusterd cl_volume_demo The cluster status then looks as follows (in a 4-node cluster; note the configuration above could be applied, unchanged, to a cluster of any number of nodes): ============ Last updated: Fri Mar 30 10:54:50 2012 Last change: Thu Mar 29 17:20:17 2012 via crmd on gluster02.h Stack: openais Current DC: gluster03.h - partition with quorum Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558 4 Nodes configured, 4 expected votes 8 Resources configured. ============ Online: [ gluster02.h gluster03.h gluster04.h gluster01.h ] Clone Set: cl_glusterd [p_glusterd] Started: [ gluster02.h gluster03.h gluster04.h gluster01.h ] Clone Set: cl_volume_demo [p_volume_demo] Started: [ gluster01.h gluster02.h gluster03.h gluster04.h ] This is also a way of providing automatic glusterd and brick recovery in systems where neither upstart nor systemd are available. Change-Id: Ied46657bdfd2dd72dc97cf41b0eb7adcecacd18f BUG: 869559 Signed-off-by: Florian Haas <florian@hastexo.com> Reviewed-on: http://review.gluster.org/3043 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2012-02-20 16:25:43 +01:00
,
[OCF_SUBDIR='ocf'],
)
AC_SUBST(OCF_SUBDIR)
Add OCF compliant resource agents for glusterd and volumes These resource agents plug glusterd into Open Cluster Framework (OCF) compliant cluster resource managers, like Pacemaker. The glusterd RA is fairly trivial; it simply manages the glusterd daemon like any upstart or systemd job would, except that Pacemaker can do it in a cluster-aware fashion. The volume RA is a bit more involved; It starts a volume and monitors individual brick's daemons in a cluster aware fashion, recovering bricks when their processes fail. Note that this does NOT imply people would deploy GlusterFS servers in pairs, or anything of that nature. Pacemaker has the ability to deploy cluster resources as clones, meaning glusterd and volumes would be configured as follows in a Pacemaker cluster: primitive p_glusterd ocf:glusterfs:glusterd \ op monitor interval="30" primitive p_volume_demo ocf:glusterfs:volume \ params volname="demo" \ op monitor interval="10" clone cl_glusterd p_glusterd \ meta interleave="true" clone cl_volume_demo p_volume_demo \ meta interleave="true" ordered="true" colocation c_volume_on_glusterd inf: cl_volume_demo cl_glusterd order o_glusterd_before_volume 0: cl_glusterd cl_volume_demo The cluster status then looks as follows (in a 4-node cluster; note the configuration above could be applied, unchanged, to a cluster of any number of nodes): ============ Last updated: Fri Mar 30 10:54:50 2012 Last change: Thu Mar 29 17:20:17 2012 via crmd on gluster02.h Stack: openais Current DC: gluster03.h - partition with quorum Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558 4 Nodes configured, 4 expected votes 8 Resources configured. ============ Online: [ gluster02.h gluster03.h gluster04.h gluster01.h ] Clone Set: cl_glusterd [p_glusterd] Started: [ gluster02.h gluster03.h gluster04.h gluster01.h ] Clone Set: cl_volume_demo [p_volume_demo] Started: [ gluster01.h gluster02.h gluster03.h gluster04.h ] This is also a way of providing automatic glusterd and brick recovery in systems where neither upstart nor systemd are available. Change-Id: Ied46657bdfd2dd72dc97cf41b0eb7adcecacd18f BUG: 869559 Signed-off-by: Florian Haas <florian@hastexo.com> Reviewed-on: http://review.gluster.org/3043 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2012-02-20 16:25:43 +01:00
2009-02-18 17:36:07 +05:30
# LEX needs a check
AC_PROG_LEX
if test "x${LEX}" != "xflex" -a "x${FLEX}" != "xlex"; then
AC_MSG_ERROR([Flex or lex required to build glusterfs.])
fi
dnl
dnl Word sizes...
dnl
AC_CHECK_SIZEOF(short)
AC_CHECK_SIZEOF(int)
AC_CHECK_SIZEOF(long)
AC_CHECK_SIZEOF(long long)
SIZEOF_SHORT=$ac_cv_sizeof_short
SIZEOF_INT=$ac_cv_sizeof_int
SIZEOF_LONG=$ac_cv_sizeof_long
SIZEOF_LONG_LONG=$ac_cv_sizeof_long_long
AC_SUBST(SIZEOF_SHORT)
AC_SUBST(SIZEOF_INT)
AC_SUBST(SIZEOF_LONG)
AC_SUBST(SIZEOF_LONG_LONG)
2009-02-18 17:36:07 +05:30
# YACC needs a check
AC_PROG_YACC
if test "x${YACC}" = "xbyacc" -o "x${YACC}" = "xyacc" -o "x${YACC}" = "x"; then
AC_MSG_ERROR([GNU Bison required to build glusterfs.])
fi
AC_CHECK_TOOL([LD],[ld])
Replace GPLV3 MD5 with OpenSSL MD5 Ric asked me to look at replacing the GPL licensed MD5 code with something better, i.e. perhaps faster, and with a less restrictive license, etc. So I took a couple hour holiday from working on wrapping up the client_t and did this. OpenSSL (nee SSLeay) is released under the OpenSSL license, a BSD/MIT style license. OpenSSL (libcrypto.so) is used on Linux, OS X and *BSD, Open Solaris, etc. IOW it's universally available on the platforms we care about. It's written by Eric Young (eay), now at EMC/RSA, and I can say from experience that the OpenSSL implementation of MD5 (at least) is every bit as fast as RSA's proprietary implementation (primarily because the implementations are very, very similar.) The last time I surveyed MD5 implementations I found they're all pretty much the same speed. I changed the APIs (and ABIs) for the strong and weak checksums. Strictly speaking I didn't need to do that. They're only called on short strings of data, i.e. pathnames, so using int32_t and uint32_t is ostensibly okay. My change is arguably a better, more general API for this sort of thing. It's also what bit me when gerrit/jenkins validation failed due to glusterfs segv-ing. (I didn't pay close enough attention to the implementation of the weak checksum. But it forced me to learn what gerrit/jenkins are doing and going forward I can do better testing before submitting to gerrit.) Now resubmitting with a BZ Change-Id: I545fade1604e74fc68399894550229bd57a5e0df BUG: 807718 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3019 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2012-03-27 11:14:23 -04:00
AC_CHECK_LIB([crypto], [MD5], , AC_MSG_ERROR([OpenSSL crypto library is required to build glusterfs]))
2009-02-18 17:36:07 +05:30
AC_CHECK_LIB([pthread], [pthread_mutex_init], , AC_MSG_ERROR([Posix threads library is required to build glusterfs]))
Replace GPLV3 MD5 with OpenSSL MD5 Ric asked me to look at replacing the GPL licensed MD5 code with something better, i.e. perhaps faster, and with a less restrictive license, etc. So I took a couple hour holiday from working on wrapping up the client_t and did this. OpenSSL (nee SSLeay) is released under the OpenSSL license, a BSD/MIT style license. OpenSSL (libcrypto.so) is used on Linux, OS X and *BSD, Open Solaris, etc. IOW it's universally available on the platforms we care about. It's written by Eric Young (eay), now at EMC/RSA, and I can say from experience that the OpenSSL implementation of MD5 (at least) is every bit as fast as RSA's proprietary implementation (primarily because the implementations are very, very similar.) The last time I surveyed MD5 implementations I found they're all pretty much the same speed. I changed the APIs (and ABIs) for the strong and weak checksums. Strictly speaking I didn't need to do that. They're only called on short strings of data, i.e. pathnames, so using int32_t and uint32_t is ostensibly okay. My change is arguably a better, more general API for this sort of thing. It's also what bit me when gerrit/jenkins validation failed due to glusterfs segv-ing. (I didn't pay close enough attention to the implementation of the weak checksum. But it forced me to learn what gerrit/jenkins are doing and going forward I can do better testing before submitting to gerrit.) Now resubmitting with a BZ Change-Id: I545fade1604e74fc68399894550229bd57a5e0df BUG: 807718 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3019 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2012-03-27 11:14:23 -04:00
2009-02-18 17:36:07 +05:30
AC_CHECK_FUNC([dlopen], [has_dlopen=yes], AC_CHECK_LIB([dl], [dlopen], , AC_MSG_ERROR([Dynamic linking library required to build glusterfs])))
AC_CHECK_FUNC([gettext], [has_gettext=yes], AC_CHECK_LIB([intl], [gettext], , AC_MSG_ERROR([gettext support is required to build glusterfs])))
2009-02-18 17:36:07 +05:30
AC_CHECK_HEADERS([sys/xattr.h])
AC_CHECK_HEADERS([sys/extattr.h])
Replace GPLV3 MD5 with OpenSSL MD5 Ric asked me to look at replacing the GPL licensed MD5 code with something better, i.e. perhaps faster, and with a less restrictive license, etc. So I took a couple hour holiday from working on wrapping up the client_t and did this. OpenSSL (nee SSLeay) is released under the OpenSSL license, a BSD/MIT style license. OpenSSL (libcrypto.so) is used on Linux, OS X and *BSD, Open Solaris, etc. IOW it's universally available on the platforms we care about. It's written by Eric Young (eay), now at EMC/RSA, and I can say from experience that the OpenSSL implementation of MD5 (at least) is every bit as fast as RSA's proprietary implementation (primarily because the implementations are very, very similar.) The last time I surveyed MD5 implementations I found they're all pretty much the same speed. I changed the APIs (and ABIs) for the strong and weak checksums. Strictly speaking I didn't need to do that. They're only called on short strings of data, i.e. pathnames, so using int32_t and uint32_t is ostensibly okay. My change is arguably a better, more general API for this sort of thing. It's also what bit me when gerrit/jenkins validation failed due to glusterfs segv-ing. (I didn't pay close enough attention to the implementation of the weak checksum. But it forced me to learn what gerrit/jenkins are doing and going forward I can do better testing before submitting to gerrit.) Now resubmitting with a BZ Change-Id: I545fade1604e74fc68399894550229bd57a5e0df BUG: 807718 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3019 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2012-03-27 11:14:23 -04:00
AC_CHECK_HEADERS([openssl/md5.h])
AC_CHECK_HEADERS([linux/falloc.h])
case $host_os in
darwin*)
if ! test "`/usr/bin/sw_vers | grep ProductVersion: | cut -f 2 | cut -d. -f2`" -ge 5; then
AC_MSG_ERROR([You need at least OS X 10.5 (Leopard) to build Glusterfs])
fi
;;
esac
2009-02-18 17:36:07 +05:30
dnl Mac OS X does not have spinlocks
AC_CHECK_FUNC([pthread_spin_init], [have_spinlock=yes])
if test "x${have_spinlock}" = "xyes"; then
AC_DEFINE(HAVE_SPINLOCK, 1, [define if found spinlock])
fi
AC_SUBST(HAVE_SPINLOCK)
dnl some os may not have GNU defined strnlen function
AC_CHECK_FUNC([strnlen], [have_strnlen=yes])
if test "x${have_strnlen}" = "xyes"; then
AC_DEFINE(HAVE_STRNLEN, 1, [define if found strnlen])
fi
AC_SUBST(HAVE_STRNLEN)
AC_CHECK_FUNC([setfsuid], [have_setfsuid=yes])
AC_CHECK_FUNC([setfsgid], [have_setfsgid=yes])
if test "x${have_setfsuid}" = "xyes" -a "x${have_setfsgid}" = "xyes"; then
AC_DEFINE(HAVE_SET_FSID, 1, [define if found setfsuid setfsgid])
fi
# FUSE section
AC_ARG_ENABLE([fuse-client],
AC_HELP_STRING([--disable-fuse-client],
[Do not build the fuse client. NOTE: you cannot mount glusterfs without the client]))
2009-02-18 17:36:07 +05:30
BUILD_FUSE_CLIENT=no
if test "x$enable_fuse_client" != "xno"; then
2009-02-18 17:36:07 +05:30
FUSE_CLIENT_SUBDIR=fuse
BUILD_FUSE_CLIENT="yes"
fi
bd: posix/multi-brick support to BD xlator Current BD xlator (block backend) has a few limitations such as * Creation of directories not supported * Supports only single brick * Does not use extended attributes (and client gfid) like posix xlator * Creation of special files (symbolic links, device nodes etc) not supported Basic limitation of not allowing directory creation is blocking oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM creates multi-level directories when GlusterFS is used as storage backend for storing VM images. To overcome these limitations a new BD xlator with following improvements is suggested. * New hybrid BD xlator that handles both regular files and block device files * The volume will have both POSIX and BD bricks. Regular files are created on POSIX bricks, block devices are created on the BD brick (VG) * BD xlator leverages exiting POSIX xlator for most POSIX calls and hence sits above the POSIX xlator * Block device file is differentiated from regular file by an extended attribute * The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a posix file to Logical Volume (LV). * When a client sends a request to set BD_XATTR on a posix file, a new LV is created and mapped to posix file. So every block device will have a representative file in POSIX brick with 'user.glusterfs.bd' (BD_XATTR) set. * Here after all operations on this file results in LV related operations. For example opening a file that has BD_XATTR set results in opening the LV block device, reading results in reading the corresponding LV block device. When BD xlator gets request to set BD_XATTR via setxattr call, it creates a LV and information about this LV is placed in the xattr of the posix file. xattr "user.glusterfs.bd" used to identify that posix file is mapped to BD. Usage: Server side: [root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2 It creates a distributed gluster volume 'bdvol' with Volume Group vg1 using posix brick /storage/vg1_info in host1 and Volume Group vg2 using /storage/vg2_info in host2. [root@host1 ~]# gluster volume start bdvol Client side: [root@node ~]# mount -t glusterfs host1:/bdvol /media [root@node ~]# touch /media/posix It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick [root@node ~]# mkdir /media/image [root@node ~]# touch /media/image/lv1 It also creates regular posix file 'lv1' in either host1:/vg1 or host2:/vg2 brick [root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1 [root@node ~]# Above setxattr results in creating a new LV in corresponding brick's VG and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size' [root@node ~]# truncate -s5G /media/image/lv1 It results in resizig LV 'lv1'to 5G New BD xlator code is placed in xlators/storage/bd directory. Also add volume-uuid to the VG so that same VG can't be used for other bricks/volumes. After deleting a gluster volume, one has to manually remove the associated tag using vgchange <vg-name> --deltag <trusted.glusterfs.volume-id:<volume-id>> Changes from previous version V5: * Removed support for delayed deleting of LVs Changes from previous version V4: * Consolidated the patches * Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR. Changes from previous version V3: * Added support in FUSE to support full/linked clone * Added support to merge snapshots and provide information about origin * bd_map xlator removed * iatt structure used in inode_ctx. iatt is cached and updated during fsync/flush * aio support * Type and capabilities of volume are exported through getxattr Changes from version 2: * Used inode_context for caching BD size and to check if loc/fd is BD or not. * Added GlusterFS server offloaded copy and snapshot through setfattr FOP. As part of this libgfapi is modified. * BD xlator supports stripe * During unlinking if a LV file is already opened, its added to delete list and bd_del_thread tries to delete from this list when a last reference to that file is closed. Changes from previous version: * gfid is used as name of LV * ? is used to specify VG name for creating BD volume in volume create, add-brick. gluster volume create volname host:/path?vg * open-behind issue is fixed * A replicate brick can be added dynamically and LVs from source brick are replicated to destination brick * A distribute brick can be added dynamically and rebalance operation distributes existing LVs/files to the new brick * Thin provisioning support added. * bd_map xlator support retained * setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and setfattr -n user.glusterfs.bd -v "thin" creates thin LV * Capability and backend information added to gluster volume info (and --xml) so that management tools can exploit BD xlator. * tracing support for bd xlator added TODO: * Add support to display snapshots for a given LV * Display posix filename for list-origin instead of gfid Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2 BUG: 1028672 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/4809 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2013-11-13 22:44:42 +05:30
AC_ARG_ENABLE([bd-xlator],
AC_HELP_STRING([--enable-bd-xlator], [Build BD xlator]))
if test "x$enable_bd_xlator" != "xno"; then
AC_CHECK_LIB([lvm2app],
[lvm_init,lvm_lv_from_name],
[HAVE_BD_LIB="yes"],
[HAVE_BD_LIB="no"])
if test "x$HAVE_BD_LIB" = "xyes"; then
# lvm_lv_from_name() has been made public with lvm2-2.02.79
AC_CHECK_DECLS(
[lvm_lv_from_name],
[NEED_LVM_LV_FROM_NAME_DECL="no"],
[NEED_LVM_LV_FROM_NAME_DECL="yes"],
[[#include <lvm2app.h>]])
fi
fi
if test "x$enable_bd_xlator" = "xyes" -a "x$HAVE_BD_LIB" = "xno"; then
echo "BD xlator requested but required lvm2 development library not found."
exit 1
fi
BUILD_BD_XLATOR=no
if test "x${enable-bd-xlator}" != "xno" -a "x${HAVE_BD_LIB}" = "xyes"; then
BUILD_BD_XLATOR=yes
AC_DEFINE(HAVE_BD_XLATOR, 1, [define if lvm2app library found and bd xlator
enabled])
if test "x$NEED_LVM_LV_FROM_NAME_DECL" = "xyes"; then
AC_DEFINE(NEED_LVM_LV_FROM_NAME_DECL, 1, [defined if lvm_lv_from_name()
was not found in the lvm2app.h header, but can be linked])
fi
fi
AM_CONDITIONAL([ENABLE_BD_XLATOR], [test x$BUILD_BD_XLATOR = xyes])
Transparent data encryption and metadata authentication .. in the systems with non-trusted server This new functionality can be useful in various cloud technologies. It is implemented via a special encryption/crypt translator,which works on the client side and performs encryption and authentication; 1. Class of supported algorithms The crypt translator can support any atomic symmetric block cipher algorithms (which require to pad plain/cipher text before performing encryption/decryption transform (see glossary in atom.c for definitions). In particular, it can support algorithms with the EOF issue (which require to pad the end of file by extra-data). Crypt translator performs translations user -> (offset, size) -> (aligned-offset, padded-size) ->server (and backward), and resolves individual FOPs (write(), truncate(), etc) to read-modify-write sequences. A volume can contain files encrypted by different algorithms of the mentioned class. To change some option value just reconfigure the volume. Currently only one algorithm is supported: AES_XTS. Example of algorithms, which can not be supported by the crypt translator: 1. Asymmetric block cipher algorithms, which inflate data, e.g. RSA; 2. Symmetric block cipher algorithms with inline MACs for data authentication. 2. Implementation notes. a) Atomic algorithms Since any process in a stackable file system manipulates with local data (which can be obsoleted by local data of another process), any atomic cipher algorithm without proper support can lead to non-POSIX behavior. To resolve the "collisions" we introduce locks: before performing FOP->read(), FOP->write(), etc. the process should first lock the file. b) Algorithms with EOF issue Such algorithms require to pad the end of file with some extra-data. Without proper support this will result in losing information about real file size. Keeping a track of real file size is a responsibility of the crypt translator. A special extended attribute with the name "trusted.glusterfs.crypt.att.size" is used for this purpose. All files contained in bricks of encrypted volume do have "padded" sizes. 3. Non-trusted servers and Metadata authentication We assume that server, where user's data is stored on is non-trusted. It means that the server can be subjected to various attacks directed to reveal user's encrypted personal data. We provide protection against such attacks. Every encrypted file has specific private attributes (cipher algorithm id, atom size, etc), which are packed to a string (so-called "format string") and stored as a special extended attribute with the name "trusted.glusterfs.crypt.att.cfmt". We protect the string from tampering. This protection is mandatory, hardcoded and is always on. Without such protection various attacks (based on extending the scope of per-file secret keys) are possible. Our authentication method has been developed in tight collaboration with Red Hat security team and is implemented as "metadata loader of version 1" (see file metadata.c). This method is NIST-compliant and is based on checking 8-byte per-hardlink MACs created(updated) by FOP->create(), FOP->link(), FOP->unlink(), FOP->rename() by the following unique entities: . file (hardlink) name; . verified file's object id (gfid). Every time, before manipulating with a file, we check it's MACs at FOP->open() time. Some FOPs don't require a file to be opened (e.g. FOP->truncate()). In such cases the crypt translator opens the file mandatory. 4. Generating keys Unique per-file keys are derived by NIST-compliant methods from the a) parent key; b) unique verified object-id of the file (gfid); Per-volume master key, provided by user at mount time is in the root of this "tree of keys". Those keys are used to: 1) encrypt/decrypt file data; 2) encrypt/decrypt file metadata; 3) create per-file and per-link MACs for metadata authentication. 5. Instructions Getting started with crypt translator Example: 1) Create a volume "myvol" and enable encryption: # gluster volume create myvol pepelac:/vols/xvol # gluster volume set myvol encryption on 2) Set location (absolute pathname) of your master key: # gluster volume set myvol encryption.master-key /home/me/mykey 3) Set other options to override default options, if needed. Start the volume. 4) On the client side make sure that the file /home/me/mykey exists and contains proper per-volume master key (that is 256-bit AES key). This key has to be in hex form, i.e. should be represented by 64 symbols from the set {'0', ..., '9', 'a', ..., 'f'}. The key should start at the beginning of the file. All symbols at offsets >= 64 are ignored. 5) Mount the volume "myvol" on the client side: # glusterfs --volfile-server=pepelac --volfile-id=myvol /mnt After successful mount the file which contains master key may be removed. NOTE: Keeping the master key between mount sessions is in user's competence. ********************************************************************** WARNING! Losing the master key will make content of all regular files inaccessible. Mount with improper master key allows to access content of directories: file names are not encrypted. ********************************************************************** 6. Options of crypt translator 1) "master-key": specifies location (absolute pathname) of the file which contains per-volume master key. There is no default location for master key. 2) "data-key-size": specifies size of per-file key for data encryption Possible values: . "256" default value . "512" 3) "block-size": specifies atom size. Possible values: . "512" . "1024" . "2048" . "4096" default value; 7. Test cases Any workload, which involves the following file operations: ->create(); ->open(); ->readv(); ->writev(); ->truncate(); ->ftruncate(); ->link(); ->unlink(); ->rename(); ->readdirp(). 8. TODOs: 1) Currently size of IOs issued by crypt translator is restricted by block_size (4K by default). We can use larger IOs to improve performance. Change-Id: I2601fe95c5c4dc5b22308a53d0cbdc071d5e5cee BUG: 1030058 Signed-off-by: Edward Shishkin <edward@redhat.com> Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4667 Tested-by: Gluster Build System <jenkins@build.gluster.com>
2013-03-13 21:56:46 +01:00
# start encryption/crypt section
AC_CHECK_HEADERS([openssl/cmac.h], [have_cmac_h=yes], [have_cmac_h=no])
AC_ARG_ENABLE([crypt-xlator],
AC_HELP_STRING([--enable-crypt-xlator], [Build crypt encryption xlator]))
if test "x$enable_crypt_xlator" = "xyes" -a "x$have_cmac_h" = "xno"; then
echo "Encryption xlator requires OpenSSL with cmac.h"
exit 1
fi
BUILD_CRYPT_XLATOR=no
if test "x$enable_crypt_xlator" != "xno" -a "x$have_cmac_h" = "xyes"; then
BUILD_CRYPT_XLATOR=yes
AC_DEFINE(HAVE_CRYPT_XLATOR, 1, [enable building crypt encryption xlator])
fi
AM_CONDITIONAL([ENABLE_CRYPT_XLATOR], [test x$BUILD_CRYPT_XLATOR = xyes])
2009-02-18 17:36:07 +05:30
AC_SUBST(FUSE_CLIENT_SUBDIR)
# end FUSE section
2009-08-11 18:26:11 -07:00
# FUSERMOUNT section
AC_ARG_ENABLE([fusermount],
AC_HELP_STRING([--disable-fusermount],
[Use system's fusermount]))
2009-08-11 18:26:11 -07:00
BUILD_FUSERMOUNT="yes"
if test "x$enable_fusermount" = "xno"; then
BUILD_FUSERMOUNT="no"
else
2009-08-11 18:26:11 -07:00
AC_DEFINE(GF_FUSERMOUNT, 1, [Use our own fusermount])
FUSERMOUNT_SUBDIR="contrib/fuse-util"
2009-08-11 18:26:11 -07:00
fi
AC_SUBST(FUSERMOUNT_SUBDIR)
#end FUSERMOUNT section
# QEMU_BLOCK section
AC_ARG_ENABLE([qemu-block],
AC_HELP_STRING([--enable-qemu-block],
[Build QEMU Block formats translator]))
if test "x$enable_qemu_block" != "xno"; then
PKG_CHECK_MODULES([GLIB], [glib-2.0],
[HAVE_GLIB_2="yes"],
[HAVE_GLIB_2="no"])
fi
if test "x$enable_qemu_block" = "xyes" -a "x$HAVE_GLIB_2" = "xno"; then
echo "QEMU Block formats translator requires libglib-2.0, but missing."
exit 1
fi
BUILD_QEMU_BLOCK=no
if test "x${enable_qemu_block}" != "xno" -a "x${HAVE_GLIB_2}" = "xyes"; then
BUILD_QEMU_BLOCK=yes
AC_DEFINE(HAVE_QEMU_BLOCK, 1, [define if libglib-2.0 library found and QEMU
Block translator enabled])
fi
AM_CONDITIONAL([ENABLE_QEMU_BLOCK], [test x$BUILD_QEMU_BLOCK = xyes])
# end QEMU_BLOCK section
2009-08-11 18:26:11 -07:00
2009-02-18 17:36:07 +05:30
# EPOLL section
AC_ARG_ENABLE([epoll],
AC_HELP_STRING([--disable-epoll],
[Use poll instead of epoll.]))
2009-02-18 17:36:07 +05:30
BUILD_EPOLL=no
if test "x$enable_epoll" != "xno"; then
AC_CHECK_HEADERS([sys/epoll.h],
[BUILD_EPOLL=yes],
[BUILD_EPOLL=no])
2009-02-18 17:36:07 +05:30
fi
# end EPOLL section
# IBVERBS section
AC_ARG_ENABLE([ibverbs],
AC_HELP_STRING([--disable-ibverbs],
[Do not build the ibverbs transport]))
2009-02-18 17:36:07 +05:30
if test "x$enable_ibverbs" != "xno"; then
AC_CHECK_LIB([ibverbs],
[ibv_get_device_list],
[HAVE_LIBIBVERBS="yes"],
[HAVE_LIBIBVERBS="no"])
AC_CHECK_LIB([rdmacm], [rdma_create_id], [HAVE_RDMACM="yes"], [HAVE_RDMACM="no"])
2009-02-18 17:36:07 +05:30
fi
if test "x$enable_ibverbs" = "xyes"; then
if test "x$HAVE_LIBIBVERBS" = "xno"; then
echo "ibverbs-transport requested, but libibverbs is not present."
exit 1
fi
if test "x$HAVE_RDMACM" = "xno"; then
echo "ibverbs-transport requested, but librdmacm is not present."
exit 1
fi
2009-02-18 17:36:07 +05:30
fi
BUILD_RDMA=no
2009-02-18 17:36:07 +05:30
BUILD_IBVERBS=no
if test "x$enable_ibverbs" != "xno" -a "x$HAVE_LIBIBVERBS" = "xyes" -a "x$HAVE_RDMACM" = "xyes"; then
2009-02-18 17:36:07 +05:30
IBVERBS_SUBDIR=ib-verbs
BUILD_IBVERBS=yes
RDMA_SUBDIR=rdma
BUILD_RDMA=yes
2009-02-18 17:36:07 +05:30
fi
AC_SUBST(IBVERBS_SUBDIR)
AC_SUBST(RDMA_SUBDIR)
2009-02-18 17:36:07 +05:30
# end IBVERBS section
# SYNCDAEMON section
AC_ARG_ENABLE([georeplication],
AC_HELP_STRING([--disable-georeplication],
[Do not install georeplication components]))
BUILD_SYNCDAEMON=no
case $host_os in
linux*)
#do nothing
;;
netbsd*)
#do nothing
;;
*)
#disabling geo replication for non-linux platforms
enable_georeplication=no
;;
esac
SYNCDAEMON_COMPILE=0
if test "x$enable_georeplication" != "xno"; then
SYNCDAEMON_SUBDIR=geo-replication
SYNCDAEMON_COMPILE=1
BUILD_SYNCDAEMON="yes"
AM_PATH_PYTHON([2.4])
echo -n "checking if python is python 2.x... "
if echo $PYTHON_VERSION | grep ^2; then
:
else
echo no
AC_MSG_ERROR([only python 2.x is supported])
fi
echo -n "checking if python has ctypes support... "
if "$PYTHON" -c 'import ctypes' 2>/dev/null; then
echo yes
else
echo no
AC_MSG_ERROR([python does not have ctypes support])
fi
fi
AC_SUBST(SYNCDAEMON_COMPILE)
AC_SUBST(SYNCDAEMON_SUBDIR)
# end SYNCDAEMON section
# CDC xlator - check if libz is present if so enable HAVE_LIB_Z
echo -n "checking if libz is present... "
PKG_CHECK_MODULES([ZLIB], [zlib >= 1.2.0],
[echo "yes (features requiring zlib enabled)" AC_DEFINE(HAVE_LIB_Z, 1, [define if zlib is present])],
[echo "no"] )
AC_SUBST(LIBZ_CFLAGS)
AC_SUBST(LIBZ_LIBS)
# end CDC xlator secion
# check for systemtap/dtrace
BUILD_SYSTEMTAP=no
AC_MSG_CHECKING([whether to include systemtap tracing support])
AC_ARG_ENABLE([systemtap],
[AS_HELP_STRING([--enable-systemtap],
[Enable inclusion of systemtap trace support])],
[ENABLE_SYSTEMTAP="${enableval}"], [ENABLE_SYSTEMTAP="def"])
AM_CONDITIONAL([ENABLE_SYSTEMTAP], [test "x${ENABLE_SYSTEMTAP}" = "xyes"])
AC_MSG_RESULT(${ENABLE_SYSTEMTAP})
if test "x${ENABLE_SYSTEMTAP}" != "xno"; then
AC_CHECK_PROG(DTRACE, dtrace, "yes", "no")
AC_CHECK_HEADER([sys/sdt.h], [SDT_H_FOUND="yes"],
[SDT_H_FOUND="no"])
fi
if test "x${ENABLE_SYSTEMTAP}" = "xyes"; then
if test "x${DTRACE}" = "xno"; then
AC_MSG_ERROR([dtrace not found])
elif test "$x{SDT_H_FOUND}" = "xno"; then
AC_MSG_ERROR([systemtap support needs sys/sdt.h header])
fi
fi
if test "x${DTRACE}" = "xyes" -a "x${SDT_H_FOUND}" = "xyes"; then
AC_MSG_CHECKING([x"${DTRACE}"xy"${SDT_H_FOUND}"y])
AC_DEFINE([HAVE_SYSTEMTAP], [1], [Define to 1 if using probes.])
BUILD_SYSTEMTAP=yes
fi
# end of systemtap/dtrace
# xml-output
AC_ARG_ENABLE([xml-output],
AC_HELP_STRING([--disable-xml-output],
[Disable the xml output]))
BUILD_XML_OUTPUT="yes"
if test "x$enable_xml_output" != "xno"; then
#check if libxml is present if so enable HAVE_LIB_XML
m4_ifdef([AM_PATH_XML2],[AM_PATH_XML2([2.6.19])], [no_xml=yes])
if test "x${no_xml}" = "x"; then
AC_DEFINE([HAVE_LIB_XML], [1], [Define to 1 if using libxml2.])
else
AC_MSG_WARN([libxml2 devel libraries not found disabling XML support])
BUILD_XML_OUTPUT="no"
fi
else
BUILD_XML_OUTPUT="no"
2013-03-25 08:22:16 -04:00
fi
# end of xml-output
2009-02-18 17:36:07 +05:30
dnl FreeBSD > 5 has execinfo as a Ported library for giving a workaround
dnl solution to GCC backtrace functionality
AC_CHECK_HEADERS([execinfo.h], [have_backtrace=yes],
AC_CHECK_LIB([execinfo], [backtrace], [have_backtrace=yes]))
dnl AC_MSG_ERROR([libexecinfo not found libexecinfo required.])))
if test "x${have_backtrace}" = "xyes"; then
AC_DEFINE(HAVE_BACKTRACE, 1, [define if found backtrace])
fi
AC_SUBST(HAVE_BACKTRACE)
dnl glusterfs prints memory usage to stderr by sending it SIGUSR1
AC_CHECK_FUNC([malloc_stats], [have_malloc_stats=yes])
if test "x${have_malloc_stats}" = "xyes"; then
AC_DEFINE(HAVE_MALLOC_STATS, 1, [define if found malloc_stats])
fi
AC_SUBST(HAVE_MALLOC_STATS)
dnl Linux, Solaris, Cygwin
AC_CHECK_MEMBERS([struct stat.st_atim.tv_nsec])
dnl FreeBSD, NetBSD
AC_CHECK_MEMBERS([struct stat.st_atimespec.tv_nsec])
case $host_os in
*netbsd*)
CFLAGS="${CFLAGS} -D_INCOMPLETE_XOPEN_C063"
;;
esac
AC_CHECK_FUNC([linkat], [have_linkat=yes])
if test "x${have_linkat}" = "xyes"; then
AC_DEFINE(HAVE_LINKAT, 1, [define if found linkat])
fi
AC_SUBST(HAVE_LINKAT)
2009-02-18 17:36:07 +05:30
libglusterfs: Add monotonic clocking counter for timer thread gettimeofday() returns the current wall clock time and timezone. Using these functions in order to measure the passage of time (how long an operation took) therefore seems like a no-brainer. This time suffer's from some limitations: a. They have a low resolution: “High-performance” timing by definition, requires clock resolutions into the microseconds or better. b. They can jump forwards and backwards in time: Computer clocks all tick at slightly different rates, which causes the time to drift. Most systems have NTP enabled which periodically adjusts the system clock to keep them in sync with “actual” time. The adjustment can cause the clock to suddenly jump forward (artificially inflating your timing numbers) or jump backwards (causing your timing calculations to go negative or hugely positive). In such cases timer thread could go into an infinite loop. From 'man gettimeofday': ---------- .. .. The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2). .. .. ---------- Rationale: For calculating interval timing for Timer thread, all that’s needed should be clock as a simple counter that increments at a stable rate. This is necessary to avoid the jumps which are caused by using "wall time", this counter must be monotonic that can never “tick” backwards, ever. Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb BUG: 1017993 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2013-10-10 04:19:16 -07:00
dnl check for Monotonic clock
AC_CHECK_FUNC([clock_gettime], [has_monotonic_clock=yes], AC_CHECK_LIB([rt], [clock_gettime], , AC_MSG_WARN([System doesn't have monotonic clock using contrib])))
2009-02-18 17:36:07 +05:30
dnl Check for argp
AC_CHECK_HEADER([argp.h], AC_DEFINE(HAVE_ARGP, 1, [have argp]))
AC_CONFIG_SUBDIRS(argp-standalone)
libglusterfs: Add monotonic clocking counter for timer thread gettimeofday() returns the current wall clock time and timezone. Using these functions in order to measure the passage of time (how long an operation took) therefore seems like a no-brainer. This time suffer's from some limitations: a. They have a low resolution: “High-performance” timing by definition, requires clock resolutions into the microseconds or better. b. They can jump forwards and backwards in time: Computer clocks all tick at slightly different rates, which causes the time to drift. Most systems have NTP enabled which periodically adjusts the system clock to keep them in sync with “actual” time. The adjustment can cause the clock to suddenly jump forward (artificially inflating your timing numbers) or jump backwards (causing your timing calculations to go negative or hugely positive). In such cases timer thread could go into an infinite loop. From 'man gettimeofday': ---------- .. .. The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2). .. .. ---------- Rationale: For calculating interval timing for Timer thread, all that’s needed should be clock as a simple counter that increments at a stable rate. This is necessary to avoid the jumps which are caused by using "wall time", this counter must be monotonic that can never “tick” backwards, ever. Change-Id: I701d31e71a85a73d21a6c5cd15583e7a5a645eeb BUG: 1017993 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/6070 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2013-10-10 04:19:16 -07:00
2009-02-18 17:36:07 +05:30
BUILD_ARGP_STANDALONE=no
if test "x${ac_cv_header_argp_h}" = "xno"; then
2009-02-18 17:36:07 +05:30
BUILD_ARGP_STANDALONE=yes
ARGP_STANDALONE_CPPFLAGS='-I${top_srcdir}/argp-standalone'
ARGP_STANDALONE_LDADD='${top_builddir}/argp-standalone/libargp.a'
fi
AC_SUBST(ARGP_STANDALONE_CPPFLAGS)
AC_SUBST(ARGP_STANDALONE_LDADD)
AC_CHECK_HEADER([malloc.h], AC_DEFINE(HAVE_MALLOC_H, 1, [have malloc.h]))
AC_CHECK_FUNC([llistxattr], [have_llistxattr=yes])
if test "x${have_llistxattr}" = "xyes"; then
AC_DEFINE(HAVE_LLISTXATTR, 1, [define if llistxattr exists])
fi
AC_CHECK_FUNC([fdatasync], [have_fdatasync=yes])
if test "x${have_fdatasync}" = "xyes"; then
AC_DEFINE(HAVE_FDATASYNC, 1, [define if fdatasync exists])
fi
AC_CHECK_FUNC([fallocate], [have_fallocate=yes])
if test "x${have_fallocate}" = "xyes"; then
AC_DEFINE(HAVE_FALLOCATE, 1, [define if fallocate exists])
fi
AC_CHECK_FUNC([posix_fallocate], [have_posix_fallocate=yes])
if test "x${have_posix_fallocate}" = "xyes"; then
AC_DEFINE(HAVE_POSIX_FALLOCATE, 1, [define if posix_fallocate exists])
fi
# Check the distribution where you are compiling glusterfs on
GF_DISTRIBUTION=
AC_CHECK_FILE([/etc/debian_version])
AC_CHECK_FILE([/etc/SuSE-release])
AC_CHECK_FILE([/etc/redhat-release])
if test "x$ac_cv_file__etc_debian_version" = "xyes"; then
GF_DISTRIBUTION=Debian
fi
if test "x$ac_cv_file__etc_SuSE_release" = "xyes"; then
GF_DISTRIBUTION=SuSE
fi
if test "x$ac_cv_file__etc_redhat_release" = "xyes"; then
GF_DISTRIBUTION=Redhat
fi
AC_SUBST(GF_DISTRIBUTION)
2009-02-18 17:36:07 +05:30
GF_HOST_OS=""
GF_LDFLAGS="-rdynamic"
# check for gcc -Werror=format-security
saved_CFLAGS=$CFLAGS
CFLAGS="-Wformat -Werror=format-security"
AC_MSG_CHECKING([whether $CC accepts -Werror=format-security])
AC_COMPILE_IFELSE([AC_LANG_PROGRAM()], [cc_werror_format_security=yes], [cc_werror_format_security=no])
echo $cc_werror_format_security
if test "x$cc_werror_format_security" = "xno"; then
CFLAGS="$saved_CFLAGS"
else
CFLAGS="$saved_CFLAGS $CFLAGS"
fi
# check for gcc -Werror=implicit-function-declaration
saved_CFLAGS=$CFLAGS
CFLAGS="-Werror=implicit-function-declaration"
AC_MSG_CHECKING([whether $CC accepts -Werror=implicit-function-declaration])
AC_COMPILE_IFELSE([AC_LANG_PROGRAM()], [cc_werror_implicit=yes], [cc_werror_implicit=no])
echo $cc_werror_implicit
if test "x$cc_werror_implicit" = "xno"; then
CFLAGS="$saved_CFLAGS"
else
CFLAGS="$saved_CFLAGS $CFLAGS"
fi
2009-02-18 17:36:07 +05:30
case $host_os in
linux*)
GF_HOST_OS="GF_LINUX_HOST_OS"
GF_CFLAGS="${ARGP_STANDALONE_CPPFLAGS}"
GF_GLUSTERFS_CFLAGS="${GF_CFLAGS}"
GF_LDADD="${ARGP_STANDALONE_LDADD}"
GF_FUSE_CFLAGS="-DFUSERMOUNT_DIR=\\\"\$(bindir)\\\""
;;
2009-02-18 17:36:07 +05:30
solaris*)
GF_HOST_OS="GF_SOLARIS_HOST_OS"
GF_CFLAGS="${ARGP_STANDALONE_CPPFLAGS} -D_REENTRANT -D_POSIX_PTHREAD_SEMANTICS -m64"
GF_LDFLAGS=""
GF_GLUSTERFS_CFLAGS="${GF_CFLAGS}"
GF_LDADD="${ARGP_STANDALONE_LDADD}"
GF_GLUSTERFS_LIBS="-lnsl -lresolv -lsocket"
BUILD_FUSE_CLIENT=no
FUSE_CLIENT_SUBDIR=""
;;
*netbsd*)
GF_HOST_OS="GF_BSD_HOST_OS"
GF_CFLAGS="${ARGP_STANDALONE_CPPFLAGS} -D_INCOMPLETE_XOPEN_C063"
GF_CFLAGS="${GF_CFLAGS} -DTHREAD_UNSAFE_BASENAME"
GF_CFLAGS="${GF_CFLAGS} -DTHREAD_UNSAFE_DIRNAME"
GF_GLUSTERFS_CFLAGS="${GF_CFLAGS}"
GF_LDADD="${ARGP_STANDALONE_LDADD}"
if test "x$ac_cv_header_execinfo_h" = "xyes"; then
GF_GLUSTERFS_LIBS="-lexecinfo"
fi
GF_FUSE_LDADD="-lperfuse"
BUILD_FUSE_CLIENT=yes
LEXLIB=""
;;
2009-02-18 17:36:07 +05:30
*bsd*)
GF_HOST_OS="GF_BSD_HOST_OS"
GF_CFLAGS="${ARGP_STANDALONE_CPPFLAGS}"
GF_CFLAGS="${GF_CFLAGS} -DTHREAD_UNSAFE_BASENAME"
GF_CFLAGS="${GF_CFLAGS} -DTHREAD_UNSAFE_DIRNAME"
GF_GLUSTERFS_CFLAGS="${GF_CFLAGS}"
GF_LDADD="${ARGP_STANDALONE_LDADD}"
if test "x$ac_cv_header_execinfo_h" = "xyes"; then
GF_GLUSTERFS_LIBS="-lexecinfo"
fi
BUILD_FUSE_CLIENT=no
;;
2009-02-18 17:36:07 +05:30
darwin*)
GF_HOST_OS="GF_DARWIN_HOST_OS"
LIBTOOL=glibtool
GF_CFLAGS="${ARGP_STANDALONE_CPPFLAGS} -D__DARWIN_64_BIT_INO_T -bundle -undefined suppress -flat_namespace -D_XOPEN_SOURCE"
GF_CFLAGS="${GF_CFLAGS} -DTHREAD_UNSAFE_BASENAME"
GF_CFLAGS="${GF_CFLAGS} -DTHREAD_UNSAFE_DIRNAME"
GF_GLUSTERFS_CFLAGS="${ARGP_STANDALONE_CPPFLAGS} -D__DARWIN_64_BIT_INO_T -undefined suppress -flat_namespace"
GF_LDADD="${ARGP_STANDALONE_LDADD}"
GF_FUSE_CFLAGS="-I\$(CONTRIBDIR)/macfuse"
;;
2009-02-18 17:36:07 +05:30
esac
# enable debug section
AC_ARG_ENABLE([debug],
AC_HELP_STRING([--enable-debug],
[Enable debug build options.]))
# syslog section
AC_ARG_ENABLE([syslog],
AC_HELP_STRING([--disable-syslog],
[Disable syslog for logging]))
USE_SYSLOG="yes"
if test "x$enable_syslog" != "xno"; then
AC_DEFINE(GF_USE_SYSLOG, 1, [Use syslog for logging])
else
USE_SYSLOG="no"
fi
AM_CONDITIONAL([ENABLE_SYSLOG], [test x$USE_SYSLOG = xyes])
#end syslog section
BUILD_READLINE=no
AC_CHECK_LIB([readline -lcurses],[readline],[RLLIBS="-lreadline -lcurses"])
AC_CHECK_LIB([readline -ltermcap],[readline],[RLLIBS="-lreadline -ltermcap"])
AC_CHECK_LIB([readline -lncurses],[readline],[RLLIBS="-lreadline -lncurses"])
if test "x$RLLIBS" != "x"; then
AC_DEFINE(HAVE_READLINE, 1, [readline enabled CLI])
BUILD_READLINE=yes
fi
BUILD_LIBAIO=no
AC_CHECK_LIB([aio],[io_setup],[LIBAIO="-laio"])
if test "x$LIBAIO" != "x"; then
AC_DEFINE(HAVE_LIBAIO, 1, [libaio based POSIX enabled])
BUILD_LIBAIO=yes
fi
# glupy section
BUILD_GLUPY=no
have_python2=no
have_Python_h=no
AM_PATH_PYTHON()
if echo $PYTHON_VERSION | grep ^2; then
have_python2=yes
fi
# Save flags before testing python
saved_CFLAGS=$CFLAGS
saved_CPPFLAGS=$CPPFLAGS
saved_LDFLAGS=$LDFLAGS
CFLAGS=`${PYTHON}-config --cflags`
CPPFLAGS=$CFLAGS
LDFLAGS="-L`${PYTHON}-config --prefix`/lib `${PYTHON}-config --ldflags`"
AC_CHECK_HEADERS([python$PYTHON_VERSION/Python.h],[have_Python_h=yes],[])
AC_ARG_ENABLE([glupy],
AS_HELP_STRING([--enable-glupy],
[build glupy]))
case x$enable_glupy in
xyes)
if test "x$have_python2" = "xyes" -a "x$have_Python_h" = "xyes"; then
BUILD_GLUPY=yes
PYTHONDEV_CFLAGS="$CFLAGS"
PYTHONDEV_CPPFLAGS="$CPPFLAGS"
PYTHONDEV_LDFLAGS="$LDFLAGS"
AC_SUBST(PYTHONDEV_CFLAGS)
AC_SUBST(PYTHONDEV_CPPFLAGS)
AC_SUBST(PYTHONDEV_LDFLAGS)
else
AC_MSG_ERROR([glupy requires python-devel/python-dev package and python2.x])
fi
;;
xno)
;;
*)
if test "x$have_python2" = "xyes" -a "x$have_Python_h" = "xyes"; then
BUILD_GLUPY=yes
PYTHONDEV_CFLAGS="$CFLAGS"
PYTHONDEV_CPPFLAGS="$CPPFLAGS"
PYTHONDEV_LDFLAGS="$LDFLAGS"
AC_SUBST(PYTHONDEV_CFLAGS)
AC_SUBST(PYTHONDEV_CPPFLAGS)
AC_SUBST(PYTHONDEV_LDFLAGS)
else
AC_MSG_WARN([
---------------------------------------------------------------------------------
cannot build glupy. python 2.x and python-devel/python-dev package are required.
---------------------------------------------------------------------------------])
fi
;;
esac
# Restore flags
CFLAGS=$saved_CFLAGS
CPPFLAGS=$saved_CPPFLAGS
LDFLAGS=$saved_LDFLAGS
if test "x$BUILD_GLUPY" = "xyes"; then
BUILD_PYTHON_INC=`$PYTHON -c "from distutils import sysconfig; print sysconfig.get_python_inc()"`
BUILD_PYTHON_LIB=python$PYTHON_VERSION
GLUPY_SUBDIR=glupy
GLUPY_SUBDIR_MAKEFILE=xlators/features/glupy/Makefile
GLUPY_SUBDIR_SRC_MAKEFILE=xlators/features/glupy/src/Makefile
echo "building glupy with -isystem $BUILD_PYTHON_INC -l $BUILD_PYTHON_LIB"
AC_SUBST(BUILD_PYTHON_INC)
AC_SUBST(BUILD_PYTHON_LIB)
AC_SUBST(GLUPY_SUBDIR)
AC_SUBST(GLUPY_SUBDIR_MAKEFILE)
AC_SUBST(GLUPY_SUBDIR_SRC_MAKEFILE)
fi
# end glupy section
AC_SUBST(CFLAGS)
# end enable debug section
2009-02-18 17:36:07 +05:30
AC_SUBST(GF_HOST_OS)
AC_SUBST([GF_GLUSTERFS_LIBS])
2009-02-18 17:36:07 +05:30
AC_SUBST(GF_GLUSTERFS_CFLAGS)
AC_SUBST(GF_CFLAGS)
AC_SUBST(GF_LDFLAGS)
AC_SUBST(GF_LDADD)
AC_SUBST(GF_FUSE_LDADD)
AC_SUBST(GF_FUSE_CFLAGS)
AC_SUBST(RLLIBS)
AC_SUBST(LIBAIO)
AC_SUBST(AM_MAKEFLAGS)
AC_SUBST(AM_LIBTOOLFLAGS)
2009-02-18 17:36:07 +05:30
2009-08-11 18:26:11 -07:00
CONTRIBDIR='$(top_srcdir)/contrib'
AC_SUBST(CONTRIBDIR)
GF_CPPDEFINES='-D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -D$(GF_HOST_OS)'
GF_CPPINCLUDES='-I$(top_srcdir)/libglusterfs/src -I$(CONTRIBDIR)/uuid'
GF_CPPFLAGS="$GF_CPPDEFINES $GF_CPPINCLUDES"
AC_SUBST([GF_CPPFLAGS])
Replace GPLV3 MD5 with OpenSSL MD5 Ric asked me to look at replacing the GPL licensed MD5 code with something better, i.e. perhaps faster, and with a less restrictive license, etc. So I took a couple hour holiday from working on wrapping up the client_t and did this. OpenSSL (nee SSLeay) is released under the OpenSSL license, a BSD/MIT style license. OpenSSL (libcrypto.so) is used on Linux, OS X and *BSD, Open Solaris, etc. IOW it's universally available on the platforms we care about. It's written by Eric Young (eay), now at EMC/RSA, and I can say from experience that the OpenSSL implementation of MD5 (at least) is every bit as fast as RSA's proprietary implementation (primarily because the implementations are very, very similar.) The last time I surveyed MD5 implementations I found they're all pretty much the same speed. I changed the APIs (and ABIs) for the strong and weak checksums. Strictly speaking I didn't need to do that. They're only called on short strings of data, i.e. pathnames, so using int32_t and uint32_t is ostensibly okay. My change is arguably a better, more general API for this sort of thing. It's also what bit me when gerrit/jenkins validation failed due to glusterfs segv-ing. (I didn't pay close enough attention to the implementation of the weak checksum. But it forced me to learn what gerrit/jenkins are doing and going forward I can do better testing before submitting to gerrit.) Now resubmitting with a BZ Change-Id: I545fade1604e74fc68399894550229bd57a5e0df BUG: 807718 Signed-off-by: Kaleb KEITHLEY <kkeithle@redhat.com> Reviewed-on: http://review.gluster.com/3019 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
2012-03-27 11:14:23 -04:00
AM_CONDITIONAL([GF_DARWIN_HOST_OS], test "${GF_HOST_OS}" = "GF_DARWIN_HOST_OS")
2009-02-18 17:36:07 +05:30
AM_CONDITIONAL([GF_INSTALL_VAR_LIB_GLUSTERD], test ! -d ${localstatedir}/lib/glusterd && test -d ${sysconfdir}/glusterd )
build: Start using library versioning for various libraries According to libtool three individual numbers stand for CURRENT:REVISION:AGE, or C:R:A for short. The libtool script typically tacks these three numbers onto the end of the name of the .so file it creates. The formula for calculating the file numbers on Linux and Solaris is /path/to/library/<library_name>.(C - A).(A).(R) As you release new versions of your library, you will update the library's C:R:A. Although the rules for changing these version numbers can quickly become confusing, a few simple tips should help keep you on track. The libtool documentation goes into greater depth. In essence, every time you make a change to the library and release it, the C:R:A should change. A new library should start with 0:0:0. Each time you change the public interface (i.e., your installed header files), you should increment the CURRENT number. This is called your interface number. The main use of this interface number is to tag successive revisions of your API. The AGE number is how many consecutive versions of the API the current implementation supports. Thus if the CURRENT library API is the sixth published version of the interface and it is also binary compatible with the fourth and fifth versions (i.e., the last two), the C:R:A might be 6:0:2. When you break binary compatibility, you need to set AGE to 0 and of course increment CURRENT. The REVISION marks a change in the source code of the library that doesn't affect the interface-for example, a minor bug fix. Anytime you increment CURRENT, you should set REVISION back to 0. Change-Id: Id72e74c1642c804fea6f93ec109135c7c16f1810 BUG: 862082 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2013-08-17 13:01:23 -07:00
dnl pkg-config versioning
GFAPI_VERSION="6.0.0"
build: Start using library versioning for various libraries According to libtool three individual numbers stand for CURRENT:REVISION:AGE, or C:R:A for short. The libtool script typically tacks these three numbers onto the end of the name of the .so file it creates. The formula for calculating the file numbers on Linux and Solaris is /path/to/library/<library_name>.(C - A).(A).(R) As you release new versions of your library, you will update the library's C:R:A. Although the rules for changing these version numbers can quickly become confusing, a few simple tips should help keep you on track. The libtool documentation goes into greater depth. In essence, every time you make a change to the library and release it, the C:R:A should change. A new library should start with 0:0:0. Each time you change the public interface (i.e., your installed header files), you should increment the CURRENT number. This is called your interface number. The main use of this interface number is to tag successive revisions of your API. The AGE number is how many consecutive versions of the API the current implementation supports. Thus if the CURRENT library API is the sixth published version of the interface and it is also binary compatible with the fourth and fifth versions (i.e., the last two), the C:R:A might be 6:0:2. When you break binary compatibility, you need to set AGE to 0 and of course increment CURRENT. The REVISION marks a change in the source code of the library that doesn't affect the interface-for example, a minor bug fix. Anytime you increment CURRENT, you should set REVISION back to 0. Change-Id: Id72e74c1642c804fea6f93ec109135c7c16f1810 BUG: 862082 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2013-08-17 13:01:23 -07:00
LIBGFCHANGELOG_VERSION="0.0.1"
AC_SUBST(GFAPI_VERSION)
AC_SUBST(LIBGFCHANGELOG_VERSION)
dnl libtool versioning
LIBGFXDR_LT_VERSION="0:1:0"
LIBGFRPC_LT_VERSION="0:1:0"
LIBGLUSTERFS_LT_VERSION="0:1:0"
LIBGFCHANGELOG_LT_VERSION="0:1:0"
GFAPI_LT_VERSION="6:0:0"
build: Start using library versioning for various libraries According to libtool three individual numbers stand for CURRENT:REVISION:AGE, or C:R:A for short. The libtool script typically tacks these three numbers onto the end of the name of the .so file it creates. The formula for calculating the file numbers on Linux and Solaris is /path/to/library/<library_name>.(C - A).(A).(R) As you release new versions of your library, you will update the library's C:R:A. Although the rules for changing these version numbers can quickly become confusing, a few simple tips should help keep you on track. The libtool documentation goes into greater depth. In essence, every time you make a change to the library and release it, the C:R:A should change. A new library should start with 0:0:0. Each time you change the public interface (i.e., your installed header files), you should increment the CURRENT number. This is called your interface number. The main use of this interface number is to tag successive revisions of your API. The AGE number is how many consecutive versions of the API the current implementation supports. Thus if the CURRENT library API is the sixth published version of the interface and it is also binary compatible with the fourth and fifth versions (i.e., the last two), the C:R:A might be 6:0:2. When you break binary compatibility, you need to set AGE to 0 and of course increment CURRENT. The REVISION marks a change in the source code of the library that doesn't affect the interface-for example, a minor bug fix. Anytime you increment CURRENT, you should set REVISION back to 0. Change-Id: Id72e74c1642c804fea6f93ec109135c7c16f1810 BUG: 862082 Signed-off-by: Harshavardhana <harsha@harshavardhana.net> Reviewed-on: http://review.gluster.org/5645 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Niels de Vos <ndevos@redhat.com> Reviewed-by: Vijay Bellur <vbellur@redhat.com>
2013-08-17 13:01:23 -07:00
AC_SUBST(LIBGFXDR_LT_VERSION)
AC_SUBST(LIBGFRPC_LT_VERSION)
AC_SUBST(LIBGLUSTERFS_LT_VERSION)
AC_SUBST(LIBGFCHANGELOG_LT_VERSION)
AC_SUBST(GFAPI_LT_VERSION)
2009-02-18 17:36:07 +05:30
AC_OUTPUT
echo
echo "GlusterFS configure summary"
echo "==========================="
echo "FUSE client : $BUILD_FUSE_CLIENT"
echo "Infiniband verbs : $BUILD_IBVERBS"
echo "epoll IO multiplex : $BUILD_EPOLL"
echo "argp-standalone : $BUILD_ARGP_STANDALONE"
echo "fusermount : $BUILD_FUSERMOUNT"
echo "readline : $BUILD_READLINE"
echo "georeplication : $BUILD_SYNCDAEMON"
echo "Linux-AIO : $BUILD_LIBAIO"
echo "Enable Debug : $BUILD_DEBUG"
echo "systemtap : $BUILD_SYSTEMTAP"
bd: posix/multi-brick support to BD xlator Current BD xlator (block backend) has a few limitations such as * Creation of directories not supported * Supports only single brick * Does not use extended attributes (and client gfid) like posix xlator * Creation of special files (symbolic links, device nodes etc) not supported Basic limitation of not allowing directory creation is blocking oVirt/VDSM to consume BD xlator as part of Gluster domain since VDSM creates multi-level directories when GlusterFS is used as storage backend for storing VM images. To overcome these limitations a new BD xlator with following improvements is suggested. * New hybrid BD xlator that handles both regular files and block device files * The volume will have both POSIX and BD bricks. Regular files are created on POSIX bricks, block devices are created on the BD brick (VG) * BD xlator leverages exiting POSIX xlator for most POSIX calls and hence sits above the POSIX xlator * Block device file is differentiated from regular file by an extended attribute * The xattr 'user.glusterfs.bd' (BD_XATTR) plays a role in mapping a posix file to Logical Volume (LV). * When a client sends a request to set BD_XATTR on a posix file, a new LV is created and mapped to posix file. So every block device will have a representative file in POSIX brick with 'user.glusterfs.bd' (BD_XATTR) set. * Here after all operations on this file results in LV related operations. For example opening a file that has BD_XATTR set results in opening the LV block device, reading results in reading the corresponding LV block device. When BD xlator gets request to set BD_XATTR via setxattr call, it creates a LV and information about this LV is placed in the xattr of the posix file. xattr "user.glusterfs.bd" used to identify that posix file is mapped to BD. Usage: Server side: [root@host1 ~]# gluster volume create bdvol host1:/storage/vg1_info?vg1 host2:/storage/vg2_info?vg2 It creates a distributed gluster volume 'bdvol' with Volume Group vg1 using posix brick /storage/vg1_info in host1 and Volume Group vg2 using /storage/vg2_info in host2. [root@host1 ~]# gluster volume start bdvol Client side: [root@node ~]# mount -t glusterfs host1:/bdvol /media [root@node ~]# touch /media/posix It creates regular posix file 'posix' in either host1:/vg1 or host2:/vg2 brick [root@node ~]# mkdir /media/image [root@node ~]# touch /media/image/lv1 It also creates regular posix file 'lv1' in either host1:/vg1 or host2:/vg2 brick [root@node ~]# setfattr -n "user.glusterfs.bd" -v "lv" /media/image/lv1 [root@node ~]# Above setxattr results in creating a new LV in corresponding brick's VG and it sets 'user.glusterfs.bd' with value 'lv:<default-extent-size' [root@node ~]# truncate -s5G /media/image/lv1 It results in resizig LV 'lv1'to 5G New BD xlator code is placed in xlators/storage/bd directory. Also add volume-uuid to the VG so that same VG can't be used for other bricks/volumes. After deleting a gluster volume, one has to manually remove the associated tag using vgchange <vg-name> --deltag <trusted.glusterfs.volume-id:<volume-id>> Changes from previous version V5: * Removed support for delayed deleting of LVs Changes from previous version V4: * Consolidated the patches * Removed usage of BD_XATTR_SIZE and consolidated it in BD_XATTR. Changes from previous version V3: * Added support in FUSE to support full/linked clone * Added support to merge snapshots and provide information about origin * bd_map xlator removed * iatt structure used in inode_ctx. iatt is cached and updated during fsync/flush * aio support * Type and capabilities of volume are exported through getxattr Changes from version 2: * Used inode_context for caching BD size and to check if loc/fd is BD or not. * Added GlusterFS server offloaded copy and snapshot through setfattr FOP. As part of this libgfapi is modified. * BD xlator supports stripe * During unlinking if a LV file is already opened, its added to delete list and bd_del_thread tries to delete from this list when a last reference to that file is closed. Changes from previous version: * gfid is used as name of LV * ? is used to specify VG name for creating BD volume in volume create, add-brick. gluster volume create volname host:/path?vg * open-behind issue is fixed * A replicate brick can be added dynamically and LVs from source brick are replicated to destination brick * A distribute brick can be added dynamically and rebalance operation distributes existing LVs/files to the new brick * Thin provisioning support added. * bd_map xlator support retained * setfattr -n user.glusterfs.bd -v "lv" creates a regular LV and setfattr -n user.glusterfs.bd -v "thin" creates thin LV * Capability and backend information added to gluster volume info (and --xml) so that management tools can exploit BD xlator. * tracing support for bd xlator added TODO: * Add support to display snapshots for a given LV * Display posix filename for list-origin instead of gfid Change-Id: I00d32dfbab3b7c806e0841515c86c3aa519332f2 BUG: 1028672 Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com> Reviewed-on: http://review.gluster.org/4809 Tested-by: Gluster Build System <jenkins@build.gluster.com> Reviewed-by: Anand Avati <avati@redhat.com>
2013-11-13 22:44:42 +05:30
echo "Block Device xlator : $BUILD_BD_XLATOR"
echo "glupy : $BUILD_GLUPY"
echo "Use syslog : $USE_SYSLOG"
echo "XML output : $BUILD_XML_OUTPUT"
echo "QEMU Block formats : $BUILD_QEMU_BLOCK"
Transparent data encryption and metadata authentication .. in the systems with non-trusted server This new functionality can be useful in various cloud technologies. It is implemented via a special encryption/crypt translator,which works on the client side and performs encryption and authentication; 1. Class of supported algorithms The crypt translator can support any atomic symmetric block cipher algorithms (which require to pad plain/cipher text before performing encryption/decryption transform (see glossary in atom.c for definitions). In particular, it can support algorithms with the EOF issue (which require to pad the end of file by extra-data). Crypt translator performs translations user -> (offset, size) -> (aligned-offset, padded-size) ->server (and backward), and resolves individual FOPs (write(), truncate(), etc) to read-modify-write sequences. A volume can contain files encrypted by different algorithms of the mentioned class. To change some option value just reconfigure the volume. Currently only one algorithm is supported: AES_XTS. Example of algorithms, which can not be supported by the crypt translator: 1. Asymmetric block cipher algorithms, which inflate data, e.g. RSA; 2. Symmetric block cipher algorithms with inline MACs for data authentication. 2. Implementation notes. a) Atomic algorithms Since any process in a stackable file system manipulates with local data (which can be obsoleted by local data of another process), any atomic cipher algorithm without proper support can lead to non-POSIX behavior. To resolve the "collisions" we introduce locks: before performing FOP->read(), FOP->write(), etc. the process should first lock the file. b) Algorithms with EOF issue Such algorithms require to pad the end of file with some extra-data. Without proper support this will result in losing information about real file size. Keeping a track of real file size is a responsibility of the crypt translator. A special extended attribute with the name "trusted.glusterfs.crypt.att.size" is used for this purpose. All files contained in bricks of encrypted volume do have "padded" sizes. 3. Non-trusted servers and Metadata authentication We assume that server, where user's data is stored on is non-trusted. It means that the server can be subjected to various attacks directed to reveal user's encrypted personal data. We provide protection against such attacks. Every encrypted file has specific private attributes (cipher algorithm id, atom size, etc), which are packed to a string (so-called "format string") and stored as a special extended attribute with the name "trusted.glusterfs.crypt.att.cfmt". We protect the string from tampering. This protection is mandatory, hardcoded and is always on. Without such protection various attacks (based on extending the scope of per-file secret keys) are possible. Our authentication method has been developed in tight collaboration with Red Hat security team and is implemented as "metadata loader of version 1" (see file metadata.c). This method is NIST-compliant and is based on checking 8-byte per-hardlink MACs created(updated) by FOP->create(), FOP->link(), FOP->unlink(), FOP->rename() by the following unique entities: . file (hardlink) name; . verified file's object id (gfid). Every time, before manipulating with a file, we check it's MACs at FOP->open() time. Some FOPs don't require a file to be opened (e.g. FOP->truncate()). In such cases the crypt translator opens the file mandatory. 4. Generating keys Unique per-file keys are derived by NIST-compliant methods from the a) parent key; b) unique verified object-id of the file (gfid); Per-volume master key, provided by user at mount time is in the root of this "tree of keys". Those keys are used to: 1) encrypt/decrypt file data; 2) encrypt/decrypt file metadata; 3) create per-file and per-link MACs for metadata authentication. 5. Instructions Getting started with crypt translator Example: 1) Create a volume "myvol" and enable encryption: # gluster volume create myvol pepelac:/vols/xvol # gluster volume set myvol encryption on 2) Set location (absolute pathname) of your master key: # gluster volume set myvol encryption.master-key /home/me/mykey 3) Set other options to override default options, if needed. Start the volume. 4) On the client side make sure that the file /home/me/mykey exists and contains proper per-volume master key (that is 256-bit AES key). This key has to be in hex form, i.e. should be represented by 64 symbols from the set {'0', ..., '9', 'a', ..., 'f'}. The key should start at the beginning of the file. All symbols at offsets >= 64 are ignored. 5) Mount the volume "myvol" on the client side: # glusterfs --volfile-server=pepelac --volfile-id=myvol /mnt After successful mount the file which contains master key may be removed. NOTE: Keeping the master key between mount sessions is in user's competence. ********************************************************************** WARNING! Losing the master key will make content of all regular files inaccessible. Mount with improper master key allows to access content of directories: file names are not encrypted. ********************************************************************** 6. Options of crypt translator 1) "master-key": specifies location (absolute pathname) of the file which contains per-volume master key. There is no default location for master key. 2) "data-key-size": specifies size of per-file key for data encryption Possible values: . "256" default value . "512" 3) "block-size": specifies atom size. Possible values: . "512" . "1024" . "2048" . "4096" default value; 7. Test cases Any workload, which involves the following file operations: ->create(); ->open(); ->readv(); ->writev(); ->truncate(); ->ftruncate(); ->link(); ->unlink(); ->rename(); ->readdirp(). 8. TODOs: 1) Currently size of IOs issued by crypt translator is restricted by block_size (4K by default). We can use larger IOs to improve performance. Change-Id: I2601fe95c5c4dc5b22308a53d0cbdc071d5e5cee BUG: 1030058 Signed-off-by: Edward Shishkin <edward@redhat.com> Signed-off-by: Anand Avati <avati@redhat.com> Reviewed-on: http://review.gluster.org/4667 Tested-by: Gluster Build System <jenkins@build.gluster.com>
2013-03-13 21:56:46 +01:00
echo "Encryption xlator : $BUILD_CRYPT_XLATOR"
2009-02-18 17:36:07 +05:30
echo